Abstract

This paper deals with the optimal stopping problem under partial observation
for piecewise-deterministic Markov processes. We first obtain a recursive
formulation of the optimal filter process and derive the dynamic programming
equation of the partially observed optimal stopping problem. Then, we propose
a numerical method, based on the quantization of the discrete-time filter
process and the inter-jump times, to approximate the value function and to
compute an actual $\epsilon$-optimal stopping time. We prove the convergence of the
algorithms and bound the rates of convergence.

1 Introduction

The aim of this paper is to investigate an optimal stopping problem under partial 
observation for piecewise-deterministic Markov processes (PDMP) both from the 
theoretical and numerical points of view. PDMP's have been introduced by Davis 
[T] as a general class of stochastic models. They form a family of Markov processes 
involving deterministic motion punctuated by random jumps. The motion depends 
on three local characteristics: the flow $\Phi$, the jump rate $\lambda$ and the transition measure
$Q$, which selects the post-jump location. Starting from the point $x$, the motion of
the process $(X_t)_{t\ge 0}$ follows the flow $\Phi(x,t)$ until the first jump time $T_1$, which occurs
either spontaneously in a Poisson-like fashion with rate $\lambda(\Phi(x,t))$ or when the flow
hits the boundary of the state space. In either case, the location of the process at $T_1$
is selected by the transition measure $Q(\Phi(x,T_1),\cdot)$ and the motion restarts from this
new point $X_{T_1}$. We define similarly the time until the next jump, as well as the next
post-jump location and so on. One important property of a PDMP, relevant for the
approach developed in this paper, is that its distribution is completely characterized
by the embedded discrete-time Markov chain $(Z_n,S_n)_{n\in\mathbb{N}}$ where $Z_n$ is the $n$-th
post-jump location and $S_n$ is the $n$-th inter-jump time. A suitable choice of the state
space and local characteristics provides stochastic models covering a great number
of problems of operations research, see [T| section 33]. 

In this paper, we consider an optimal stopping problem for a partially observed
PDMP $(X_t)_{t\ge 0}$. Roughly speaking, the observation process $(Y_t)_{t\ge 0}$ is a point process
defined through the embedded discrete-time Markov chain $(Z_n,S_n)_{n\in\mathbb{N}}$. The
inter-arrival times are given by $(S_n)_{n\in\mathbb{N}}$ and the marks by a noisy function of
$(Z_n)_{n\in\mathbb{N}}$. For a given reward function $g$ and a computation horizon $N\in\mathbb{N}$, we
study the following optimal stopping problem

$$\sup_{\sigma\le T_N} \mathbb{E}[g(X_\sigma)],$$

where $T_N$ is the $N$-th jump time of the PDMP $(X_t)_{t\ge 0}$ and $\sigma$ is a stopping time with
respect to the natural filtration $\mathfrak{F}^Y=(\mathfrak{F}^Y_t)_{t\ge 0}$ generated by the observations $(Y_t)_{t\ge 0}$.

A general methodology to solve such a problem is to split it into two
sub-problems. The first one consists in deriving the filter process given by the conditional
expectation of $X_t$ with respect to the observed information $\mathfrak{F}^Y$. Its main objective is
to transform the initial problem into a completely observed optimal stopping
problem where the new state variable is the filter process. The second step consists in
solving this reformulated problem, the new difficulty being its infinite dimension.
Indeed, the filter process takes values in a set of probability measures.

Our work is inspired by [2] which deals with an optimal stopping problem under 
partial observation for a Markov chain with finite state space. The authors study 
the optimal filtering and convert their original problem into a standard optimal 
stopping problem for a continuous state space Markov chain. Then they propose a 
discretization method based on a quantization technique to approximate the value 
function. However, their method cannot be directly applied to our problem for the 
following main reasons related to the specificities of PDMPs. 

Firstly, PDMPs are continuous-time processes. It therefore appears natural to work
with the embedded Markov chain $(Z_n,S_n)_{n\in\mathbb{N}}$. In addition, we assume that $(Z_n)_{n\in\mathbb{N}}$
takes finitely many values. However, an important difficulty is that the structure of
the stopping times remains intrinsically continuous. Consequently, our problem cannot
be converted into a fully discrete-time problem.

Secondly, the distribution of a PDMP combines both absolutely continuous and 
singular components. This is due to the existence of forced jumps when the process 
hits the boundary of the state space. As a consequence the derivation of the filter 
process is not straightforward. In particular, the absolute continuity hypothesis (H) 
of [2] does not hold. 

Thirdly, in our context the reformulated optimization problem is not standard,
unlike in [2]. Indeed, although we obtain a reformulation similar to an optimal
stopping problem for a fully observed PDMP, it involves the Markov chain $(\Pi_n,S_n)_{n\in\mathbb{N}}$
that is not the embedded Markov chain of some PDMP. Therefore, a new derivation
of dynamic programming equations is required as we cannot use the results of [4]. In
particular, one needs to derive fine properties of the structure of the $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping
times. Moreover, we construct an $\epsilon$-optimal stopping time.

Finally, a natural way to proceed with the numerical approximation is then to
follow the ideas developed in [2, 5], namely to replace the filter $\Pi_n$ and the
inter-jump time $S_n$ by some finite state space approximations in the dynamic programming
equation. However, a noticeable difference from [5] lies in the fact that the dynamic
programming operators therein were Lipschitz continuous whereas our new operators
are only Lipschitz continuous between some points of discontinuity. We overcome
this drawback by splitting the operators into their restrictions onto their continuity
sets. This way, we obtain not only an approximation of the value function of the
optimal stopping problem but also an $\epsilon$-optimal stopping time with respect to the
filtration $(\mathfrak{F}^Y_t)_{t\ge 0}$ that can be computed in practice.

Our approximation procedure for random variables is based on quantization. 
There exists an extensive literature on this method. The interested reader may for 
instance consult [5] and the references within. The quantization of a random 
variable $X$ consists in finding a finite grid such that the projection $\widehat{X}$ of $X$ on this
grid minimizes some norm of the difference $X-\widehat{X}$. Roughly speaking, such a grid
will have more points in the areas of high density of $X$. As explained for instance
in [HI section 3], under some Lipschitz-continuity conditions, bounds for the rate of 
convergence of functionals of the quantized process towards the original process are 
available, which makes this technique especially appealing. Quantization methods 
have been developed recently in numerical probability or optimal stochastic control 
with applications in finance, see e.g. |Hl El ITU] . 
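
As a rough illustration of the quantization idea described above, the following Python sketch builds a grid for a scalar random variable with Lloyd's fixed-point iteration and projects samples onto it by nearest neighbour. This is only one standard way to compute such a grid and is not claimed to be the algorithm used in the cited references. The function names (`nearest_index`, `project`, `lloyd_grid`, `distortion`), the Gaussian sample and the grid sizes are assumptions made for the example.

```python
import random

def nearest_index(x, grid):
    """Index of the grid point closest to x (nearest-neighbour projection)."""
    return min(range(len(grid)), key=lambda i: abs(x - grid[i]))

def project(x, grid):
    """Quantized value of x: its projection on the grid."""
    return grid[nearest_index(x, grid)]

def lloyd_grid(samples, n_points, n_iter=30):
    """Fit a grid to scalar samples with Lloyd's fixed-point iteration."""
    ordered = sorted(samples)
    # Start from evenly spaced empirical quantiles.
    grid = [ordered[(2 * k + 1) * len(ordered) // (2 * n_points)]
            for k in range(n_points)]
    for _ in range(n_iter):
        cells = [[] for _ in grid]
        for x in samples:
            cells[nearest_index(x, grid)].append(x)
        # Move each point to the mean of its cell; keep it if the cell is empty.
        grid = [sum(c) / len(c) if c else g for g, c in zip(grid, cells)]
    return grid

def distortion(samples, grid):
    """Empirical quadratic quantization error, an estimate of E|X - X_hat|^2."""
    return sum((x - project(x, grid)) ** 2 for x in samples) / len(samples)

# Hypothetical example: quantize a standard Gaussian sample on 4 and 16 points.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
coarse = lloyd_grid(samples, 4)
fine = lloyd_grid(samples, 16)
```

In this sketch the finer grid yields a smaller empirical error, and its points are closer together near the mode of the distribution than in the tails, in line with the heuristic stated above.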

The paper is organized as follows. Section 2 introduces the notation, recalls the
definition of a PDMP, presents our assumptions and defines the optimal stopping
problem we are interested in, especially the observation process. The recursive
formulation of the filter process is derived in Section 3. In Section 4, we reduce
our partially observed problem for the PDMP $(X_t)_{t\ge 0}$ to a completely observed one
involving the process $(\Pi_n,S_n)_{n\in\mathbb{N}}$, for which we provide the dynamic programming
equation and construct a family of $\epsilon$-optimal stopping times. Then, our numerical
methods to compute the value function and an $\epsilon$-optimal stopping time are presented
in Section 5, where we also prove the convergence of our algorithms after having
recalled the main features of quantization. Finally, an academic example is discussed
in Section 6 while technical results are postponed to the Appendices.

2 Definition and notation 

In this first section, let us define a piecewise-deterministic Markov process (PDMP)
and introduce some general assumptions. For any metric space $E$, we denote $\mathcal{B}(E)$ its
Borel $\sigma$-field, $B(E)$ the set of real-valued, bounded and measurable functions defined
on $E$ and $BL(E)$ the subset of functions of $B(E)$ that are Lipschitz continuous. For
$a,b\in\mathbb{R}$, denote $a\wedge b=\min(a,b)$ and $a\vee b=\max(a,b)$.

2.1 Definition of a Piecewise-Deterministic Markov Process 

Let $E$ be an open subset of $\mathbb{R}^d$. Let $\partial E$ be its boundary and $\bar{E}$ its closure and for
any subset $A$ of $E$, $A^c$ denotes its complement. A PDMP is defined by its local
characteristics $(\Phi,\lambda,Q)$.

• The flow $\Phi:\mathbb{R}^d\times\mathbb{R}_+\to\mathbb{R}^d$ is continuous. For all $t\in\mathbb{R}_+$, $\Phi(\cdot,t)$ is a
homeomorphism and $t\mapsto\Phi(\cdot,t)$ is a semi-group: for all $x\in\mathbb{R}^d$,
$\Phi(x,t+s)=\Phi(\Phi(x,s),t)$. For all $x\in E$, define the deterministic exit time from $E$:
$t^*(x)=\inf\{t>0 \text{ such that } \Phi(x,t)\in\partial E\}$. We use here and throughout the
convention $\inf\emptyset=+\infty$.

• The jump rate $\lambda:\bar{E}\to\mathbb{R}_+$ is measurable and satisfies:

$$\forall x\in\bar{E},\ \exists\epsilon>0 \text{ such that } \int_0^{\epsilon}\lambda(\Phi(x,t))\,dt<+\infty.$$

• Finally, $Q$ is a Markov kernel on $(\bar{E},\mathcal{B}(\bar{E}))$ which satisfies:

$$\forall x\in\bar{E},\quad Q(x,E\setminus\{x\})=1.$$

From these characteristics, it can be shown [T] that there exists a filtered probability 
space $(\Omega,\mathcal{F},(\mathcal{F}_t)_{t\in\mathbb{R}_+},(\mathbb{P}_x)_{x\in E})$ on which a process $(X_t)_{t\in\mathbb{R}_+}$ is defined. Its motion,
starting from a point $x\in E$, may be constructed as follows. Let $T_1$ be a nonnegative
random variable with survival function:

$$\mathbb{P}_x(T_1>t)=\begin{cases} e^{-\Lambda(x,t)} & \text{if } 0\le t<t^*(x),\\ 0 & \text{if } t\ge t^*(x),\end{cases}$$

where for $x\in E$ and $t\in[0,t^*(x)]$, $\Lambda(x,t)=\int_0^t\lambda(\Phi(x,s))\,ds$. One then chooses an
$E$-valued random variable $Z_1$ with distribution $Q(\Phi(x,T_1),\cdot)$. The trajectory of $X_t$
for $t\le T_1$ is:

$$X_t=\begin{cases}\Phi(x,t) & \text{if } t<T_1,\\ Z_1 & \text{if } t=T_1.\end{cases}$$

Starting from the point $X_{T_1}=Z_1$, one selects in a similar way $S_2=T_2-T_1$, the time
between $T_1$ and the next jump time $T_2$, as well as $Z_2$, the next post-jump location,
and so on. Davis showed [T] that the process so defined is a strong Markov process 
$(X_t)_{t\ge 0}$ with jump times $(T_n)_{n\in\mathbb{N}}$ ($T_0=0$). The process $(Z_n,S_n)_{n\in\mathbb{N}}$, where $Z_n=X_{T_n}$
is the $n$-th post-jump location and $S_n=T_n-T_{n-1}$ ($S_0=0$) is the $n$-th inter-jump
time, is clearly a discrete-time Markov chain.
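
The construction above can be sketched in Python as follows. The sketch assumes that the jump rate is bounded by a known constant `rate_bound`, so that the spontaneous jump time can be sampled by thinning (a standard technique, used here for convenience). The toy model (flow $x+t$ on $]0,1[$, rate $2x$, two post-jump locations) and all function names are hypothetical and only serve the illustration.

```python
import random

POINTS = [0.1, 0.5]               # hypothetical finite set of post-jump locations

def flow(x, t):                   # Phi(x, t) = x + t on E = ]0, 1[
    return x + t

def t_star(x):                    # deterministic exit time from E
    return 1.0 - x

def rate(x):                      # jump rate, bounded by 2 on E
    return 2.0 * x

def kernel(x, rng):               # Q(x, .): uniform on the points different from x
    return rng.choice([p for p in POINTS if p != x])

def inter_jump_time(x, flow, rate, rate_bound, t_star, rng):
    """Sample the time to the next jump from x: thinning, truncated at t_star(x)."""
    t, horizon = 0.0, t_star(x)
    while True:
        t += rng.expovariate(rate_bound)      # candidate spontaneous jump time
        if t >= horizon:
            return horizon                    # forced jump at the boundary
        if rng.random() * rate_bound <= rate(flow(x, t)):
            return t                          # candidate accepted

def embedded_chain(x0, n_jumps, flow, rate, rate_bound, t_star, kernel, rng):
    """Return [(Z_0, S_0), ..., (Z_n, S_n)] with S_0 = 0."""
    chain, z = [(x0, 0.0)], x0
    for _ in range(n_jumps):
        s = inter_jump_time(z, flow, rate, rate_bound, t_star, rng)
        z = kernel(flow(z, s), rng)           # post-jump location ~ Q(Phi(z, s), .)
        chain.append((z, s))
    return chain

chain = embedded_chain(0.1, 20, flow, rate, 2.0, t_star, kernel, random.Random(1))
```

Each pair in `chain` is one step $(Z_n,S_n)$ of the embedded chain; an inter-jump time equal to `t_star` of the previous location corresponds to a forced jump at the boundary.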



2.2 Notation and assumptions 

The following non explosion assumption about the jump-times is standard (see for 
example p^ section 24]). 



Assumption 2.1. For all $(x,t)\in E\times\mathbb{R}_+$, $\mathbb{E}_x\Big[\sum_k\mathbf{1}_{\{T_k\le t\}}\Big]<+\infty$.

It implies that $T_k\to+\infty$ a.s. when $k\to+\infty$. Moreover, we make the following
assumption about the transition kernel $Q$.

Assumption 2.2. We assume that there exists a finite set $E_0=\{x_1,\dots,x_q\}\subset E$
such that for all $x\in\bar{E}$, one has $Q(x,E_0)=1$.

In other words, for all $n\in\mathbb{N}$, $Z_n$ may only take its values in the finite set $E_0$.
This assumption ensures that the filter process, defined in the next section, has finite
dimension. This is required to derive a tractable numerical method in Section 5.
When this assumption does not hold, one may consider a preliminary discretization
of the transition kernel to introduce it.

Assumption 2.3. We assume that the function $t^*$ is bounded on $E_0$ i.e. for all
$m\in\{1,\dots,q\}$, we assume that $0<t^*(x_m)<+\infty$.

Definition 2.4. For all $m\in\{1,\dots,q\}$, denote $t^*_m=t^*(x_m)$ and assume that $x_1,\dots,x_q$
are numbered such that $t^*_1<t^*_2<\dots<t^*_q$. Moreover, let $t^*_0=0$.

For any function $w$ in $B(E)$, introduce the following notation

$$Qw(x)=\int_E w(y)\,Q(x,dy)=\sum_{i=1}^q w(x_i)Q(x,x_i),\qquad C_w=\sup_{x\in E}|w(x)|.$$

For any Lipschitz continuous function $w$ in $BL(E)$, denote $[w]$ its Lipschitz constant

$$[w]=\sup_{x\ne y\in E}\frac{|w(x)-w(y)|}{|x-y|}.$$

Assumption 2.5. The jump rate $\lambda$ is in $B(E)$ i.e. is bounded by $C_\lambda$.

Denote $\mathcal{M}(E_0)$ the set of finite signed measures on $E_0$ and $\mathcal{M}_1(E_0)$ the subset
of probability measures on $E_0$. We equip $\mathcal{M}(E_0)$ with the norm $|\cdot|$ given by
$|\pi|=\sum_{i=1}^q|\pi^i|$ where $\pi^i$ denotes $\pi(\{x_i\})$.



2.3 Partially observed optimal stopping problem 

We consider from now on a PDMP $(X_t)_{t\ge 0}$ whose initial state $X_0=Z_0$ is a fixed
point $x_0\in E_0$. We assume that this PDMP is observed through a noise and we now
turn to the description of our observation procedure.

For all $n\in\mathbb{N}$, we assume that $S_n$ is perfectly observed but that $Z_n$ is not (except
for the initial state $Z_0$). In some examples, it seems reasonable to consider that
the jump times of the process are observed (for instance, if the jumps correspond
to changes of environment) and that, when a jump occurs, the actual post-jump
location is measured with a noise. The observation process of $Z_n$, denoted $Y_n$, is
assumed to be of the following form: $Y_0$ is deterministic and for $n\ge 1$,

$$Y_n=\varphi(Z_n)+W_n,\qquad (1)$$

where $\varphi:E_0\to\mathbb{R}^d$ and where the noise $(W_n)_{n\ge 1}$ is a sequence of $\mathbb{R}^d$-valued, i.i.d.
random variables with bounded density function $f_W$ that are also independent from
$(Z_n,S_n)_{n\in\mathbb{N}}$.

In order to define real-valued stopping times adapted to the observation process,
we need to consider a continuous-time version of the observation process. We
therefore define the piecewise-constant process $(Y_t)_{t\ge 0}$, with a slight abuse of notation¹,
as

$$Y_t=\sum_{j=0}^{+\infty}\mathbf{1}_{\{T_j\le t<T_{j+1}\}}Y_j.$$

Let $\mathfrak{F}^Y=(\mathfrak{F}^Y_t)_{t\ge 0}$ be the filtration generated by $(Y_t)_{t\ge 0}$ (the observed filtration) and
$\mathfrak{F}=(\mathfrak{F}_t)_{t\ge 0}$ be the filtration generated by $(X_t,Y_t)_{t\ge 0}$ (the total filtration). Without
changing the notation, we then complete these filtrations with all the $\mathbb{P}$-null sets.
This leads us to the following definition.

¹ The quantity $Y_n$ represents the value of the process $(Y_t)_{t\ge 0}$ at time $t=T_n$ and must not be
confused with the value of the process at time $t=n$.

Definition 2.6. Denote $\Sigma^Y$ the set of $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times that are a.s. finite
and for $n\in\mathbb{N}$, define

$$\Sigma^Y_n=\{\sigma\in\Sigma^Y \text{ such that } \sigma\le T_n \text{ a.s.}\}.$$

For all $n\in\mathbb{N}$, we define the filter $\Pi_n\in\mathcal{M}_1(E_0)$. The quantity $\Pi_n(\{x_i\})$, denoted
by $\Pi_n^i$, represents the probability of the event $\{Z_n=x_i\}$ given the information
available until time $T_n$ i.e.

$$\forall i\in\{1,\dots,q\},\quad \Pi_n^i=\mathbb{E}\big[\mathbf{1}_{\{Z_n=x_i\}}\big|\mathfrak{F}^Y_{T_n}\big].\qquad (2)$$

Finally, let $N\in\mathbb{N}$ be the horizon and $g\in B(E)$ the reward function. We are
interested in the following partially observed optimal stopping problem

$$v(\pi)=\sup_{\sigma\in\Sigma^Y_N}\mathbb{E}\big[g(X_\sigma)\big|\Pi_0=\pi\big],\qquad (3)$$

where $\pi$ is a probability measure in $\mathcal{M}_1(E_0)$. The solution of our problem is then
obtained by setting $\pi=\delta_{x_0}$. We will also need the following assumption about the
reward function $g$ associated with the optimal stopping problem.

Assumption 2.7. The function $g$ is in $B(E)$ i.e. bounded by $C_g$ and there exists
$[g]_2\in\mathbb{R}_+$ such that for all $i\in\{1,\dots,q\}$ and $t,u\in[0,t^*_i]$, one has:

$$|g(\Phi(x_i,t))-g(\Phi(x_i,u))|\le[g]_2|t-u|.$$

Now, the aims of this paper are first to make the filter process $(\Pi_n)_{n\in\mathbb{N}}$ explicit
(Section 3); second to rewrite the partially observed optimal stopping problem (3) as
a totally observed one for a suitable Markov chain on $\mathcal{M}_1(E_0)\times\mathbb{R}_+$ (Section 4.3);
third to derive a dynamic programming equation and construct a family of $\epsilon$-optimal
stopping times (Sections 4.4 and 4.5); and finally to propose a numerical method
to compute an approximation of the value function and an $\epsilon$-optimal stopping time
(Section 5). As a starting point, we will derive, in the next section, a recursive
construction of the optimal filter that is the key point of our approach.



3 Optimal filtering 

The goal of this section is to obtain a recursive formulation of the filter $\Pi_n$. As
far as we know, there is no result concerning the filter process for generic PDMPs.
We may however refer to [TT] for a recursive formulation of the filter for point
processes, that can be seen as a sub-class of PDMPs. For all $n\in\mathbb{N}$, we denote
$\mathcal{G}_n=(Y_0,S_0,\dots,Y_n,S_n)$. The continuous-time observation process $(Y_t)_{t\ge 0}$ being a
point process in the sense developed in [B], one has $\mathfrak{F}^Y_{T_n}=\sigma(\mathcal{G}_n)$ (see [51 page 58,
Theorem T2]). Moreover, $\mathfrak{F}_{T_n}=\sigma(Z_0,\dots,Z_n)\vee\mathfrak{F}^Y_{T_n}$. Concerning the filter $\Pi_n$,
first notice that, since it is an $\mathfrak{F}^Y_{T_n}$-measurable random variable, there exists for all
$n\in\mathbb{N}$ a measurable function $\pi_n:(\mathbb{R}^d\times\mathbb{R}_+)^{n+1}\to\mathcal{M}_1(E_0)$ such that $\Pi_n=\pi_n(\mathcal{G}_n)$.
As in the case of the Kalman-Bucy filter, the iteration leading from $\Pi_{n-1}$ to $\Pi_n$
can be split into two steps: prediction and correction. For all $n\ge 1$, let $\mu_n^-$ be
the conditional distribution of $(Z_n,S_n)$ given $\mathfrak{F}^Y_{T_{n-1}}$. Thus, $\mu_n^-$ is a transition kernel
defined on $(\mathbb{R}^d\times\mathbb{R}_+)^n\times\mathcal{B}(E_0\times\mathbb{R}_+)$ for all $j\in\{1,\dots,q\}$ and $\gamma_{n-1}\in(\mathbb{R}^d\times\mathbb{R}_+)^n$
by

$$\mu_n^-(\gamma_{n-1},\{x_j\},ds)=\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1}).\qquad (4)$$

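Lemma 3.1. For all $\gamma_{n-1}\in(\mathbb{R}^d\times\mathbb{R}_+)^n$, we have the following equality of probability
measures on $E_0\times\mathbb{R}^d\times\mathbb{R}_+$, for all $j\in\{1,\dots,q\}$,

$$\mathbb{P}(Z_n=x_j,Y_n\in dy,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1})=\mu_n^-(\gamma_{n-1},\{x_j\},ds)\,f_W(y-\varphi(x_j))\,dy.$$

Proof. Let $h$ be in $B(E_0\times\mathbb{R}^d\times\mathbb{R}_+)$. Using Eq. (1) that defines $Y_n$, one has

$$\mathbb{E}\big[h(Z_n,Y_n,S_n)\big|\mathcal{G}_{n-1}=\gamma_{n-1}\big]
=\sum_{j=1}^q\int h(x_j,\varphi(x_j)+w,s)\,\mathbb{P}(Z_n=x_j,S_n\in ds,W_n\in dw\,|\,\mathcal{G}_{n-1}=\gamma_{n-1}).$$

Moreover, $W_n$ is independent from $\sigma(Z_n,S_n)\vee\mathfrak{F}^Y_{T_{n-1}}=\sigma(Z_n,S_n,\mathcal{G}_{n-1})$ and admits
the density function $f_W$. One has then

$$\mathbb{E}\big[h(Z_n,Y_n,S_n)\big|\mathcal{G}_{n-1}=\gamma_{n-1}\big]
=\sum_{j=1}^q\int h(x_j,\varphi(x_j)+w,s)\,\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1})\,f_W(w)\,dw$$
$$=\sum_{j=1}^q\int h(x_j,y,s)\,\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1})\,f_W(y-\varphi(x_j))\,dy.$$

The last equality is obtained by the change of variable $y=\varphi(x_j)+w$ and gives the
result. □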

Integrating with respect to the first variable in the previous lemma (i.e. summing over
$x_j$) yields the following result.

Lemma 3.2. For all $\gamma_{n-1}\in(\mathbb{R}^d\times\mathbb{R}_+)^n$, we have the following equality of probability
measures on $\mathbb{R}^d\times\mathbb{R}_+$,

$$\mathbb{P}(Y_n\in dy,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1})=\Big[\sum_{j=1}^q\mu_n^-(\gamma_{n-1},\{x_j\},ds)\,f_W(y-\varphi(x_j))\Big]dy.$$



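Lemma 3.3. For all $n\ge 1$, $\gamma_{n-1}\in(\mathbb{R}^d\times\mathbb{R}_+)^n$ and $j\in\{1,\dots,q\}$, the distribution
$\mu_n^-$, defined by Eq. (4), satisfies

$$\mu_n^-(\gamma_{n-1},\{x_j\},ds)
=\sum_{m=0}^{q-1}\mathbf{1}_{\{s\in]t^*_m,t^*_{m+1}[\}}\Big(\sum_{i=m+1}^q\pi^i_{n-1}(\gamma_{n-1})\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j)\Big)ds$$
$$\quad+\sum_{m=1}^q\pi^m_{n-1}(\gamma_{n-1})e^{-\Lambda(x_m,t^*_m)}Q(\Phi(x_m,t^*_m),x_j)\,\delta_{t^*_m}(ds).$$

Proof. Let $h$ be a function of $B(E_0\times\mathbb{R}_+)$. Since $\sigma(\mathcal{G}_{n-1})=\mathfrak{F}^Y_{T_{n-1}}\subset\mathfrak{F}_{T_{n-1}}$, the
law of iterated conditional expectations yields

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathcal{G}_{n-1}=\gamma_{n-1}\big]=\mathbb{E}\Big[\mathbb{E}\big[h(Z_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]\Big|\mathcal{G}_{n-1}=\gamma_{n-1}\Big].$$

Besides, $\mathfrak{F}_{T_{n-1}}=\sigma(Z_0,S_0,W_0,\dots,Z_{n-1},S_{n-1},W_{n-1})$ so that

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]=\mathbb{E}\big[h(Z_n,S_n)\big|Z_0,S_0,\dots,Z_{n-1},S_{n-1}\big],$$

by independence of the sequences $(W_n)_{n\in\mathbb{N}}$ and $(Z_n,S_n)_{n\in\mathbb{N}}$. Now, we apply the
Markov property of $(Z_n,S_n)_{n\in\mathbb{N}}$ to obtain

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]=\mathbb{E}\big[h(Z_n,S_n)\big|Z_{n-1},S_{n-1}\big]$$

and finally, a well-known special feature of the transition kernel of the underlying
Markov chain of a PDMP provides

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]=\mathbb{E}\big[h(Z_n,S_n)\big|Z_{n-1}\big].$$

Moreover, the transition kernel can be explicitly expressed in terms of the local
characteristics of the PDMP, and this yields the next equations

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathcal{G}_{n-1}=\gamma_{n-1}\big]
=\mathbb{E}\Big[\sum_{i=1}^q\mathbf{1}_{\{Z_{n-1}=x_i\}}\mathbb{E}\big[h(Z_n,S_n)\big|Z_{n-1}=x_i\big]\Big|\mathcal{G}_{n-1}=\gamma_{n-1}\Big]$$
$$=\mathbb{E}\Big[\sum_{i=1}^q\mathbf{1}_{\{Z_{n-1}=x_i\}}\sum_{j=1}^q\Big(\int_{\mathbb{R}_+}h(x_j,s)\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}\mathbf{1}_{\{s<t^*_i\}}Q(\Phi(x_i,s),x_j)\,ds$$
$$\qquad+h(x_j,t^*_i)e^{-\Lambda(x_i,t^*_i)}Q(\Phi(x_i,t^*_i),x_j)\Big)\Big|\mathcal{G}_{n-1}=\gamma_{n-1}\Big]$$
$$=\sum_{j=1}^q\Big(\int_{\mathbb{R}_+}h(x_j,s)\sum_{i=1}^q\pi^i_{n-1}(\gamma_{n-1})\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}\mathbf{1}_{\{s<t^*_i\}}Q(\Phi(x_i,s),x_j)\,ds$$
$$\qquad+\sum_{i=1}^qh(x_j,t^*_i)\pi^i_{n-1}(\gamma_{n-1})e^{-\Lambda(x_i,t^*_i)}Q(\Phi(x_i,t^*_i),x_j)\Big).$$

This can be written equivalently as

$$\mathbb{E}\big[h(Z_n,S_n)\big|\mathcal{G}_{n-1}=\gamma_{n-1}\big]
=\sum_{j=1}^q\Big(\sum_{m=0}^{q-1}\int_{t^*_m}^{t^*_{m+1}}h(x_j,s)\sum_{i=m+1}^q\pi^i_{n-1}(\gamma_{n-1})\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j)\,ds$$
$$\qquad+\sum_{i=1}^qh(x_j,t^*_i)\pi^i_{n-1}(\gamma_{n-1})e^{-\Lambda(x_i,t^*_i)}Q(\Phi(x_i,t^*_i),x_j)\Big).$$

Hence the result. □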

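We now state the main result of this section, namely the recursive formulation
of the filter sequence $(\Pi_n)_{n\in\mathbb{N}}$.

Proposition 3.4. Let $\Psi=(\Psi^1,\dots,\Psi^q):\mathcal{M}_1(E_0)\times\mathbb{R}^d\times\mathbb{R}_+\to\mathcal{M}_1(E_0)$ be defined
as follows: for all $j\in\{1,\dots,q\}$,

$$\Psi^j(\pi,y,s)=\sum_{m=0}^{q-1}\mathbf{1}_{\{s\in]t^*_m,t^*_{m+1}[\}}\frac{\Psi^j_m(\pi,y,s)}{\bar\Psi_m(\pi,y,s)}+\sum_{m=1}^{q}\mathbf{1}_{\{s=t^*_m\}}\frac{\Psi^{*j}_m(y)}{\bar\Psi^*_m(y)},$$

where

$$\Psi^j_m(\pi,y,s)=\sum_{i=m+1}^q\pi^i\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j)f_W(y-\varphi(x_j)),$$
$$\bar\Psi_m(\pi,y,s)=\sum_{k=1}^q\Psi^k_m(\pi,y,s),$$
$$\Psi^{*j}_m(y)=Q(\Phi(x_m,t^*_m),x_j)f_W(y-\varphi(x_j)),$$
$$\bar\Psi^*_m(y)=\sum_{k=1}^q\Psi^{*k}_m(y).$$

Then, the filter, defined in Eq. (2), satisfies $\Pi_0^j=\mathbb{P}(Z_0=x_j)$ and the following
recursion: for all $n\ge 1$,

$$\mathbb{P}\text{-a.s.},\quad \Pi_n=\Psi(\Pi_{n-1},Y_n,S_n).$$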



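Proof. Fix $\gamma_{n-1}$ in $(\mathbb{R}^d\times\mathbb{R}_+)^n$. Bayes formula yields for all $j\in\{1,\dots,q\}$,

$$\mathbb{P}(Z_n=x_j,Y_n\in dy,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1})
=\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,s)\big)\times\mathbb{P}(Y_n\in dy,S_n\in ds\,|\,\mathcal{G}_{n-1}=\gamma_{n-1}).$$

Lemmas 3.1 and 3.2 yield

$$\mu_n^-(\gamma_{n-1},\{x_j\},ds)f_W(y-\varphi(x_j))\,dy
=\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,s)\big)\Big[\sum_{k=1}^q\mu_n^-(\gamma_{n-1},\{x_k\},ds)f_W(y-\varphi(x_k))\Big]dy.$$

With respect to $y$, one recognizes the equality of two absolutely continuous measures,
which implies the equality a.e. of the density functions. Thus, one has for almost
all $y\in\mathbb{R}^d$ w.r.t. the Lebesgue measure,

$$\mu_n^-(\gamma_{n-1},\{x_j\},ds)f_W(y-\varphi(x_j))
=\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,s)\big)\Big[\sum_{k=1}^q\mu_n^-(\gamma_{n-1},\{x_k\},ds)f_W(y-\varphi(x_k))\Big].\qquad (5)$$

Eq. (5) states the equality of two measures of the variable $s\in\mathbb{R}_+$ that contain
both an absolutely continuous part and some weighted Dirac measures. Denote
$g_1(y,s)\nu_1(ds)$ (respectively $g_2(y,s)\nu_2(ds)$) the left-hand (resp. right-hand) side term
of the previous equality. Eq. (5) means that for every function $F\in B(\mathbb{R}_+)$ and for
almost all $y\in\mathbb{R}^d$ w.r.t. the Lebesgue measure, one has

$$\int F(s)g_1(y,s)\,\nu_1(ds)=\int F(s)g_2(y,s)\,\nu_2(ds).\qquad (6)$$

Recall that, from Lemma 3.3, the distribution $\mu_n^-(\gamma_{n-1},\{x_j\},ds)$ has a density on
the interval $]t^*_m,t^*_{m+1}[$, that we will denote $f_m(\gamma_{n-1},x_j,s)$, given by

$$f_m(\gamma_{n-1},x_j,s)=\sum_{i=m+1}^q\pi^i_{n-1}(\gamma_{n-1})\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j).$$

First, take $F(s)=H(s)\mathbf{1}_{\{s\in]t^*_m,t^*_{m+1}[\}}$ in Eq. (6): for all $H\in B(\mathbb{R}_+)$, one has

$$\int_{t^*_m}^{t^*_{m+1}}H(s)f_m(\gamma_{n-1},x_j,s)f_W(y-\varphi(x_j))\,ds
=\int_{t^*_m}^{t^*_{m+1}}H(s)\,\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,s)\big)\sum_{k=1}^qf_m(\gamma_{n-1},x_k,s)f_W(y-\varphi(x_k))\,ds,$$

and thus on $]t^*_m,t^*_{m+1}[$, almost everywhere w.r.t. the Lebesgue measure, one has

$$\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,s)\big)=\frac{f_m(\gamma_{n-1},x_j,s)f_W(y-\varphi(x_j))}{\sum_{k=1}^qf_m(\gamma_{n-1},x_k,s)f_W(y-\varphi(x_k))}.$$

Finally, for $m\in\{1,\dots,q\}$, choosing $F(s)=\mathbf{1}_{\{s=t^*_m\}}$ in Eq. (6) yields the equality
of the weights at the point $t^*_m$ thus, using Lemma 3.3,

$$\mathbb{P}\big(Z_n=x_j\,\big|\,\mathcal{G}_n=(\gamma_{n-1},y,t^*_m)\big)
=\frac{\pi^m_{n-1}(\gamma_{n-1})e^{-\Lambda(x_m,t^*_m)}Q(\Phi(x_m,t^*_m),x_j)f_W(y-\varphi(x_j))}{\sum_{k=1}^q\pi^m_{n-1}(\gamma_{n-1})e^{-\Lambda(x_m,t^*_m)}Q(\Phi(x_m,t^*_m),x_k)f_W(y-\varphi(x_k))}
=\frac{Q(\Phi(x_m,t^*_m),x_j)f_W(y-\varphi(x_j))}{\sum_{k=1}^qQ(\Phi(x_m,t^*_m),x_k)f_W(y-\varphi(x_k))}.$$

Thus there exist two measurable sets $N_y\subset\mathbb{R}^d$ and $N_s\subset\mathbb{R}_+\setminus\{t^*_1,\dots,t^*_q\}$, negligible
w.r.t. the Lebesgue measures on $\mathbb{R}^d$ and $\mathbb{R}$ respectively, such that for all
$\gamma_{n-1}\in(\mathbb{R}^d\times\mathbb{R}_+)^n$, $y\in\mathbb{R}^d\setminus N_y$ and $s\in\mathbb{R}_+\setminus N_s$, one has

$$\pi_n(\gamma_{n-1},y,s)=\Psi(\pi_{n-1}(\gamma_{n-1}),y,s).\qquad (7)$$

On the one hand, we have $\mathbb{P}(Y_n\in N_y)\le\sum_{j=1}^q\mathbb{P}(\varphi(x_j)+W_n\in N_y)=0$ by
absolute continuity of the distribution of $W_n$. On the other hand, $\mathbb{P}(S_n\in N_s)=0$
because the distribution of $S_n$ is absolutely continuous on $\mathbb{R}_+\setminus\{t^*_1,\dots,t^*_q\}$ and one
has $N_s\cap\{t^*_1,\dots,t^*_q\}=\emptyset$. We therefore conclude from Eq. (7) that $\mathbb{P}$-a.s., one has
$\pi_n(\mathcal{G}_{n-1},Y_n,S_n)=\Psi(\pi_{n-1}(\mathcal{G}_{n-1}),Y_n,S_n)$. The result follows since $\mathbb{P}$-a.s., one has
$\pi_n(\mathcal{G}_{n-1},Y_n,S_n)=\Pi_n$ and $\pi_{n-1}(\mathcal{G}_{n-1})=\Pi_{n-1}$. □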



This proposition will play a crucial part in the sequel. On the one hand, this
result will enable us to prove the Markov property of the sequence $(\Pi_n,S_n)_{n\ge 0}$ w.r.t.
the observed filtration. On the other hand, the recursive formulation allows for
simulation of the process $(\Pi_n)_{n\ge 0}$, which is crucial to obtain numerical approximations.
Finally, notice that the specific structure of the PDMP appears in the recursive
formulation of the filter, which contains both an absolutely continuous part and some
weighted points.
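
To make the recursion of Proposition 3.4 more concrete, here is a minimal Python sketch of one filter update for scalar observations. It is an illustration under stated assumptions, not the authors' implementation: the function `filter_update`, the two-point toy model (flow $x+t$ on $]0,1[$, constant jump rate 1, kernel `Q`, Gaussian noise) and every name below are hypothetical. The test `s in t_stars` relies on forced jump times being stored exactly, which is a simplification.

```python
import math

def filter_update(pi, y, s, points, t_stars, rate, cum_rate, flow, Q, phi, noise_density):
    """One step Pi_n = Psi(Pi_{n-1}, Y_n, S_n) on a finite set of post-jump locations."""
    q = len(points)
    likelihood = [noise_density(y - phi(x)) for x in points]
    if s in t_stars:
        # Forced jump: the previous location must be the point x_m with t*_m = s.
        m = t_stars.index(s)
        pre = flow(points[m], s)
        weights = [Q(pre, points[j]) * likelihood[j] for j in range(q)]
    else:
        # Spontaneous jump: only points whose exit time exceeds s contribute.
        weights = []
        for j in range(q):
            total = 0.0
            for i in range(q):
                if t_stars[i] > s:
                    pre = flow(points[i], s)
                    total += (pi[i] * rate(pre) * math.exp(-cum_rate(points[i], s))
                              * Q(pre, points[j]))
            weights.append(total * likelihood[j])
    norm = sum(weights)
    return [w / norm for w in weights]

# Hypothetical toy model, numbered so that the exit times increase.
POINTS = [0.6, 0.2]
T_STARS = [0.4, 0.8]

def flow(x, t): return x + t
def rate(x): return 1.0
def cum_rate(x, t): return t                    # Lambda(x, t) for a unit rate
def phi(x): return x
def Q(pre, target): return pre if target == 0.2 else 1.0 - pre
def noise_density(w, sd=0.1):
    return math.exp(-0.5 * (w / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

args = (POINTS, T_STARS, rate, cum_rate, flow, Q, phi, noise_density)
spontaneous = filter_update([0.9, 0.1], 0.2, 0.1, *args)   # jump observed at s = 0.1
forced = filter_update([0.9, 0.1], 0.25, 0.4, *args)       # jump observed at s = t*_1
```

In the spontaneous case the observation $y=0.2$ makes the second point far more likely; in the forced case the prior weights cancel, as in the last step of the proof, and only the kernel at the boundary and the noise density matter.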
4 Dynamic programming

In this section, we derive the dynamic programming equation for the value function
of the partially observed optimal stopping problem (3). After a preliminary study
of the structure of the stopping times of $\Sigma^Y$, the first step consists in converting
the partially observed optimal stopping problem into an optimal stopping problem
under complete observation. Then, we introduce some operators to recursively build
a sequence of functions $(v_n)_{0\le n\le N}$ that are the value functions of the optimal stopping
problems with horizon $T_{N-n}$. In particular, $v_0$ is the value function of the optimal
stopping problem (3) we are interested in. We also provide a family of $\epsilon$-optimal
stopping times.



4.1 The Markov chain $(\Pi_n,S_n)_{n\ge 0}$

We start with some technical preliminary results that will be required in the sequel.
We investigate the Markov property of the filter process and give details on the
structure of the $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times.

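Proposition 4.1. The sequences $(\Pi_n,Y_n,S_n)_{n\in\mathbb{N}}$, $(\Pi_n,S_n)_{n\in\mathbb{N}}$ and $(\Pi_n)_{n\in\mathbb{N}}$ are
$(\mathfrak{F}^Y_{T_n})_{n\in\mathbb{N}}$-Markov chains.

Proof. Let $h\in B(\mathcal{M}_1(E_0)\times\mathbb{R}^d\times\mathbb{R}_+)$. The law of iterated conditional expectations
yields

$$\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}^Y_{T_{n-1}}\big]=\mathbb{E}\Big[\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]\Big|\mathfrak{F}^Y_{T_{n-1}}\Big].$$

From Proposition 3.4 and Eq. (1) which defines $Y_n$, one obtains

$$\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]
=\sum_{j=1}^q\int h\big(\Psi(\Pi_{n-1},\varphi(x_j)+w,s),\varphi(x_j)+w,s\big)\,\mathbb{P}(Z_n=x_j,W_n\in dw,S_n\in ds\,|\,\mathfrak{F}_{T_{n-1}}).$$

Yet, $W_n$ is independent from $\sigma(Z_n,S_n)\vee\mathfrak{F}_{T_{n-1}}$ and admits the density function $f_W$.
As in the proof of Lemma 3.1, one thus obtains

$$\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]
=\sum_{j=1}^q\int h\big(\Psi(\Pi_{n-1},y,s),y,s\big)\,\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,\mathfrak{F}_{T_{n-1}})\,f_W(y-\varphi(x_j))\,dy.$$

Besides, we have $\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,\mathfrak{F}_{T_{n-1}})=\mathbb{P}(Z_n=x_j,S_n\in ds\,|\,Z_{n-1})$ as in the
proof of Lemma 3.3, so that one has

$$\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}_{T_{n-1}}\big]
=\sum_{i=1}^q\mathbf{1}_{\{Z_{n-1}=x_i\}}\sum_{j=1}^q\int_{\mathbb{R}^d}\Big(\int_0^{t^*_i}h\big(\Psi(\Pi_{n-1},y,s),y,s\big)\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j)\,ds$$
$$\qquad+h\big(\Psi(\Pi_{n-1},y,t^*_i),y,t^*_i\big)e^{-\Lambda(x_i,t^*_i)}Q(\Phi(x_i,t^*_i),x_j)\Big)f_W(y-\varphi(x_j))\,dy.$$

Take now the conditional expectation w.r.t. $\mathfrak{F}^Y_{T_{n-1}}$ to obtain

$$\mathbb{E}\big[h(\Pi_n,Y_n,S_n)\big|\mathfrak{F}^Y_{T_{n-1}}\big]
=\sum_{i=1}^q\Pi^i_{n-1}\sum_{j=1}^q\int_{\mathbb{R}^d}\Big(\int_0^{t^*_i}h\big(\Psi(\Pi_{n-1},y,s),y,s\big)\lambda(\Phi(x_i,s))e^{-\Lambda(x_i,s)}Q(\Phi(x_i,s),x_j)\,ds$$
$$\qquad+h\big(\Psi(\Pi_{n-1},y,t^*_i),y,t^*_i\big)e^{-\Lambda(x_i,t^*_i)}Q(\Phi(x_i,t^*_i),x_j)\Big)f_W(y-\varphi(x_j))\,dy.$$

Hence $\mathbb{E}[h(\Pi_n,Y_n,S_n)|\mathfrak{F}^Y_{T_{n-1}}]$ is merely a function of $\Pi_{n-1}$, yielding the result for the
three processes. □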

4.2 The $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times

We now turn to the structure of the $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times.

Lemma 4.2. For all $n\in\mathbb{N}$, $T_n$ is an $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time.

Proof. Notice that for all $n\in\mathbb{N}$, $\mathbb{P}(Y_n=Y_{n+1})=0$. This stems from the absolute
continuity of the distribution of the random variables $(W_n)_{n\in\mathbb{N}}$ since

$$\{Y_n=Y_{n+1}\}\subset\bigcup_{1\le i,j\le q}\{W_n-W_{n+1}=\varphi(x_i)-\varphi(x_j)\}.$$

Hence, for all $n\in\mathbb{N}$ and $t\in\mathbb{R}_+$, one has $\mathbb{P}$-a.s. $\{T_n\le t\}=\{N_t\ge n\}$ where we
denote $N_t=\sum_{0<s\le t}\mathbf{1}_{\{Y_s\ne Y_{s^-}\}}$. The process $(N_t)_{t\ge 0}$ is $\mathfrak{F}^Y$-adapted, thus $\{N_t\ge n\}\in\mathfrak{F}^Y_t$,
and since the filtration contains the $\mathbb{P}$-null sets, one has $\{T_n\le t\}\in\mathfrak{F}^Y_t$. For all
$n\in\mathbb{N}$, $T_n$ is therefore an $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time. □

We now recall Theorem A2 T33 from [B] concerning the structure of the stopping
times for point processes and apply it in our case.

Definition 4.3. Define the filtration $(\mathfrak{F}^p_t)_{t\ge 0}$ as follows

$$\mathfrak{F}^p_t=\sigma\big(\mathbf{1}_{\{Y_n\in A\}}\mathbf{1}_{\{T_n\le s\}};\ n\ge 1,\ 0\le s\le t,\ A\in\mathcal{B}(\mathbb{R}^d)\big).$$

Theorem 4.4. Let $\sigma$ be an $(\mathfrak{F}^p_t)_{t\ge 0}$-stopping time. For all $n\in\mathbb{N}$, there exists an
$\mathfrak{F}^p_{T_n}$-measurable nonnegative random variable $R_n$ such that one has

$$\sigma\wedge T_{n+1}=(T_n+R_n)\wedge T_{n+1}\quad\text{on }\{\sigma\ge T_n\}.$$

Our observation process $(Y_t)_{t\ge 0}$ being a point process that fits the framework
developed in [B], we apply this theorem to $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times.

Proposition 4.5. For all $t\ge 0$, one has $\mathfrak{F}^Y_t=\mathfrak{F}^p_t$.

Proof. First prove that $\mathfrak{F}^Y_t\subset\mathfrak{F}^p_t$. Let $A\in\mathcal{B}(\mathbb{R}^d)$ and $0\le s\le t$, one has

$$\{Y_s\in A\}=\bigcup_{n\in\mathbb{N}}\big(\{T_n\le s<T_{n+1}\}\cap\{Y_n\in A\}\big)\in\mathfrak{F}^p_s\subset\mathfrak{F}^p_t.$$

Indeed, in the above equation, we used that $T_0$ and $Y_0$ are assumed to be
deterministic. For the reverse inclusion, let $A\in\mathcal{B}(\mathbb{R}^d)$, $n\in\mathbb{N}^*$ and $0\le s\le t$. Recall that
$Y_n=Y_{T_n}$. One has $\{Y_{T_n}\in A\}\in\mathfrak{F}^Y_{T_n}$ since $(Y_t)_{t\ge 0}$ is $\mathfrak{F}^Y$-adapted and $T_n$ is an
$(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time from Lemma 4.2. Therefore, one has
$\{Y_n\in A\}\cap\{T_n\le s\}\in\mathfrak{F}^Y_s\subset\mathfrak{F}^Y_t$, hence the result. □

We may therefore apply Theorem 4.4 to $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times.

Theorem 4.6. Let $\sigma$ be an $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time. For all $n\in\mathbb{N}$, there exists a
nonnegative, $\mathfrak{F}^Y_{T_n}$-measurable random variable $R_n$ such that one has

$$\sigma\wedge T_{n+1}=(T_n+R_n)\wedge T_{n+1}\quad\text{on }\{\sigma\ge T_n\}.$$

We single out the following result, which is a direct consequence of the above
theorem, because it will be used several times in our derivation.

Lemma 4.7. For all $n\in\mathbb{N}$, $\{T_n\le\sigma<T_{n+1}\}=\{T_n\le\sigma\}\cap\{S_{n+1}>R_n\}$.

Proof. Theorem 4.6 states that on the event $\{T_n\le\sigma\}$, one has
$\sigma\wedge T_{n+1}=T_n+(R_n\wedge S_{n+1})$ so that, still on the event $\{T_n\le\sigma\}$, one has
$\{\sigma<T_{n+1}\}=\{R_n<S_{n+1}\}$. We deduce the result from this observation. □



We now investigate the effect of the translation operator of the Markov chain
$(\Pi_n,Y_n,S_n)_{n\in\mathbb{N}}$ on the $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping times. Proposition 4.1 states that
$(\Pi_n,Y_n,S_n)_{n\in\mathbb{N}}$ is an $(\mathfrak{F}^Y_{T_n})_{n\in\mathbb{N}}$-Markov chain. Let us consider its canonical space
$\Omega=(\mathcal{M}_1(E_0)\times\mathbb{R}^d\times\mathbb{R}_+)^{\mathbb{N}}$. Thus, for $\omega=(\omega_0,\omega_1,\dots)\in\Omega$, one has
$(\Pi_n,Y_n,S_n)(\omega)=\omega_n$. Besides, we define the translation operator

$$\theta:\ \Omega\to\Omega,\qquad(\omega_0,\omega_1,\dots)\mapsto(\omega_1,\omega_2,\dots).$$

We then define $\theta^0=\mathrm{Id}_\Omega$ and recursively for $l\ge 1$, $\theta^l=\theta\circ\theta^{l-1}$. Thus, for all
$n,l\in\mathbb{N}$, one has $(\Pi_n,Y_n,S_n)\circ\theta^l=(\Pi_{n+l},Y_{n+l},S_{n+l})$. As $T_0=0$, one has

$$T_n\circ\theta^l=\sum_{k=1}^nS_k\circ\theta^l=\sum_{k=1}^nS_{k+l}=T_{n+l}-T_l.$$

The next results of this section, Lemma 4.8, Propositions 4.11 and 4.12 and Corollary
4.13, are given without proof because their proofs follow the very same lines as in
[5], from which they are adapted. However, it is important to notice that the results
from [5] cannot be applied directly to our case because the sequence $(\Pi_n,Y_n,S_n)_{n\in\mathbb{N}}$,
although it is a Markov chain, is not the underlying Markov chain of some PDMP.

Set now $\sigma\in\Sigma^Y_N$. From Theorem 4.6, for all $n\in\mathbb{N}$, there exists a nonnegative
$\mathfrak{F}^Y_{T_n}$-measurable random variable $R_n$ such that, on the event $\{\sigma\ge T_n\}$, one has

$$\sigma\wedge T_{n+1}=(T_n+R_n)\wedge T_{n+1}.$$

Lemma 4.8. Let $\bar{R}_0=R_0$ and for $k\ge 1$, $\bar{R}_k=R_k\mathbf{1}_{\{S_k\le\bar{R}_{k-1}\}}$. One has then

$$\sigma=\sum_{n=1}^{\infty}\bar{R}_{n-1}\wedge S_n.$$

Remark 4.9. This lemma proves that in Theorem 4.6, the sequence $(R_n)_{n\in\mathbb{N}}$ can
be replaced by $(\bar{R}_n)_{n\in\mathbb{N}}$. Therefore, in the sequel, we will assume, without loss of
generality, that the sequence $(R_n)_{n\in\mathbb{N}}$ satisfies the following condition: for all $n\in\mathbb{N}$,
$R_{n+1}=0$ on the event $\{S_{n+1}>R_n\}$.
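
The decomposition of a stopping time into successive waiting times, with the truncation of Remark 4.9, can be illustrated by a short Python sketch. It assumes the reading of Lemma 4.8 in which the waiting time after a jump is set to zero once the stopping time has fallen before that jump; the function name `stopping_time` and the numerical values are hypothetical.

```python
def stopping_time(R, S):
    """Return sigma = sum over n >= 1 of min(Rbar_{n-1}, S_n).

    R[k] is the waiting time R_k chosen after the k-th jump and S[k] is the
    inter-jump time S_{k+1}; both lists have the same length.
    """
    sigma, r_bar = 0.0, R[0]
    for k, s in enumerate(S):
        sigma += min(r_bar, s)
        if k + 1 < len(R):
            # Once the stopping time fell before a jump, later waits are zero.
            r_bar = R[k + 1] if s <= r_bar else 0.0
    return sigma

# Jumps at T_1 = 1.0, T_2 = 3.0, T_3 = 4.5; wait 0.5 after the first jump.
sigma_a = stopping_time([3.0, 0.5, 7.0], [1.0, 2.0, 1.5])
# Stop 0.2 after time 0, before the first jump.
sigma_b = stopping_time([0.2, 5.0, 5.0], [1.0, 1.0, 1.0])
# Never stop before the horizon T_2 = 3.0.
sigma_c = stopping_time([5.0, 5.0], [1.0, 2.0])
```

In the first case the stopping time equals $T_1+R_1=1.5$: the third waiting time is ignored because the second jump occurs after the stopping time.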

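There exists a sequence of real-valued measurable functions $(r_k)_{k\in\mathbb{N}}$ defined on
$(\mathbb{R}^d\times\mathbb{R}_+)^{k+1}$ such that $R_k=r_k(\mathcal{G}_k)$, where $\mathcal{G}_k=(Y_0,S_0,\dots,Y_k,S_k)$. Indeed,
$\mathfrak{F}^Y_{T_k}=\sigma(Y_j,S_j,\ j\le k)$ and $R_k$ is $\mathfrak{F}^Y_{T_k}$-measurable.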

Definition 4.10. Let / G N* and (-R^)fc6N be a sequence of functions defined on (R°'x 
M+)'+ixl] byRl^{^,uj) = r;(7) and fork > I, Ri{-f,uj) = ^+^(7, ^^^(7, w). 

Proposition 4.11. Assume that Ti < a < T/y. For all A; G N, one has then 
Ri{gi, 9^) = Ri+k and a = Ti + a{gi,e^) , with a : {w^ x x ^ M+ defined 

as a(7,u;) = ^Li(7, c^) A Sn{u). 

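Proposition 4.12. Let $(U_n)_{n\in\mathbb{N}}$ be a sequence of nonnegative random variables
such that for all $n$, $U_n$ is $\mathfrak{F}^Y_{T_n}$-measurable and $U_{n+1}=0$ on $\{S_{n+1}>U_n\}$. We define
$U=\sum_{n=1}^{\infty}U_{n-1}\wedge S_n$. Then $U$ is an $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time.

Corollary 4.13. For all $\gamma\in(\mathbb{R}^d\times\mathbb{R}_+)^{p+1}$, $\tilde\sigma(\gamma,\cdot)$ is an $(\mathfrak{F}^Y_t)_{t\ge 0}$-stopping time.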



4.3 Optimal stopping problem under complete observation 

In this section, we show how our optimal stopping problem under partial
observation for the process $(X_t)_{t\ge 0}$ can be converted into an optimal stopping problem
under complete observation involving the discrete-time Markov chain $(\Pi_n,S_n)_{0\le n\le N}$.
However, this does not correspond to the optimal stopping problem under complete
observation for PDMPs studied in |H E] because the Markov chain $(\Pi_n,S_n)_{n\ge 0}$ is
not the underlying Markov chain of some PDMP.

Lemma 4.14. Let a G S-^ and n > 1. For all tt G A^i(-E'o) one has 
E[giX^;,TJ\T^o = ^] 

n-l q 
k=0 1=1 

+ ^E[l|r„<.}^?(x,,)nj,|no = 7r]. 
1=1 
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Proof We split Fi[g{X„/^Tn)\^o = ^r] into several terms depending on the position 
of a w.r.t. the jump times 

n-l q 

E[(7(X.;,Tj|no = 7r] = ^^E[l|T,<.<T,+,}l{z,=.,}^7o$(x„/?fc)|no = 7r] 

fc=0 i=l 

+ EE[l{T„<.}l{z„=xj^?(x.)|no = vr]. 

i=l 

For notational convenience, consider 



Ak,i = ^{n<a<n+i}Mzk=x,}g o Rk] 

Bi = l{T„<a}l{Z„=xJ^(Xi). 



On the one hand, one has E[i?j|5^^^] = g{xi)l{T„<a}^n since {T„ < cr} E 'St„ (see 
for instance p5, p. 298, Theorem T7]). On the other hand, to compute E[Afc^j|^5^y^], 
we use Lemma 14.71 to obtain 



nAk,Mj = l{T,<.}go^x,,Rk)Ems,^^:,R,}l{z,=..}\d^, 



Y 



Details to obtain the third line in the above computations are provided by Lemma A.2. The result follows. □



4.4 Dynamic programming operators 



Based on the decomposition given by Lemma 4.14, we introduce the dynamic programming operators for the optimal stopping problem (3).



Definition 4.15. The operator $H : B(E) \to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$ is defined for all $h\in B(E)$ and $(\pi,u)\in\mathcal{M}_1(E_0)\times\mathbb{R}_+$ by

$$Hh(\pi,u) = E\big[h\circ\Phi(Z_0,u)\,1_{\{S_1>u\}} \mid \Pi_0=\pi\big].$$

The following lemma is straightforward from the distribution of $S_1$.

Lemma 4.16. For all $h\in B(E)$ and $(\pi,u)\in\mathcal{M}_1(E_0)\times\mathbb{R}_+$, one has

$$Hh(\pi,u) = \sum_{i=1}^{q} \pi^i\, 1_{\{u<t^*_i\}}\, e^{-\Lambda(x_i,u)}\, h\circ\Phi(x_i,u).$$

The function $u\mapsto Hh(\pi,u)$ is right continuous with left limits, is continuous on the intervals $]t^*_m; t^*_{m+1}[$ for all $m\in\{0,\dots,q-1\}$ and is null for $u\ge t^*_q$. For all $h\in B(E)$, we consider the restriction $H^m h$ of $Hh$ to $\mathcal{M}_1(E_0)\times[t^*_m; t^*_{m+1}[$ extended continuously by constants to $\mathcal{M}_1(E_0)\times\mathbb{R}_+$.
Definition 4.17. For all $m\in\{0,\dots,q-1\}$, we define the operator $H^m : B(E)\to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$ as follows

• if $u\le t^*_m$, $H^m h(\pi,u) = Hh(\pi,t^*_m)$,

• if $u> t^*_m$, $H^m h(\pi,u) = \sum_{i=m+1}^{q} \pi^i e^{-\Lambda(x_i,\,u\wedge t^*_{m+1})}\, h\circ\Phi(x_i,\,u\wedge t^*_{m+1})$.

Remark 4.18. For all $m\in\{0,\dots,q-1\}$ and for all $h\in B(E)$, $\pi\in\mathcal{M}_1(E_0)$, the function $u\mapsto H^m h(\pi,u)$ is continuous. Moreover, it is constant on $[0; t^*_m]$ and on $[t^*_{m+1}; +\infty[$ and one has $Hh(\pi,u) = \sum_{m=0}^{q-1} 1_{[t^*_m; t^*_{m+1}[}(u)\, H^m h(\pi,u)$.
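As an illustration only, the closed form of $Hh$ in Lemma 4.16 can be evaluated directly once the flow, the cumulative jump rate and the exit times are available. The sketch below assumes a hypothetical one-dimensional model with flow $x+vt$ and jump rate $ax$; the function and variable names (`H`, `flow`, `cum_rate`) are ours and are not taken from the paper.

```python
import math

def H(h, pi, u, points, t_star, flow, cum_rate):
    """Evaluate Hh(pi, u) = sum_i pi^i 1{u < t*_i} exp(-Lambda(x_i, u)) h(Phi(x_i, u))."""
    total = 0.0
    for p, x, ts in zip(pi, points, t_star):
        if u < ts:  # the flow started at x has not reached the boundary yet
            total += p * math.exp(-cum_rate(x, u)) * h(flow(x, u))
    return total

# Hypothetical model (an assumption): Phi(x, t) = x + v t on [0, 1[, lambda(x) = a x.
a, v = 3.0, 1.0
points = [0.0, 0.25, 0.5]                      # post-jump locations x_i
t_star = [(1.0 - x) / v for x in points]       # exit times t*(x_i)
flow = lambda x, t: x + v * t
cum_rate = lambda x, t: a * (x * t + v * t * t / 2.0)  # integral of lambda along the flow
g = lambda x: x                                # reward function
pi = [1.0 / 3.0] * 3                           # uniform filter value

value_at_zero = H(g, pi, 0.0, points, t_star, flow, cum_rate)
```

At $u=0$ the exponential factors equal 1, so the value reduces to $\sum_i \pi^i g(x_i)$; for $u$ beyond the largest exit time the sum is empty and the function returns 0, in line with the remark that $Hh$ is null there.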

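Definition 4.19. The operators $I : B(\mathcal{M}_1(E_0)) \to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$, $G : B(\mathcal{M}_1(E_0)) \to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$ and $K : B(\mathcal{M}_1(E_0)) \to B(\mathcal{M}_1(E_0))$ are defined for all $v\in B(\mathcal{M}_1(E_0))$ and $(\pi,u)\in\mathcal{M}_1(E_0)\times\mathbb{R}_+$ by

$$Iv(\pi,u) = E\big[v(\Pi_1)1_{\{S_1<u\wedge t^*(Z_0)\}} \mid \Pi_0=\pi\big],$$
$$Gv(\pi,u) = E\big[v(\Pi_1)1_{\{S_1\le u\}} \mid \Pi_0=\pi\big],$$
$$Kv(\pi) = E\big[v(\Pi_1) \mid \Pi_0=\pi\big] = Gv(\pi,t^*_q).$$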



Computations similar to those carried out in the proof of Proposition 4.1 yield explicit forms for these operators.

Lemma 4.20. For all v G B{Mi{Eo)) and (vr, m) G Mi{Eo) x M+, one has 

1 ruAt* 



Iv{tt,u) = E^Vq ' (^Ao$(a;„s')e-^(""^') 

JM. j = l 



i=l 
X 



q 



Gviji.u) = Iv{'K,u) + J2'^''^{t;<u}e' 

'{^{71, y', t*)) E Q{Hx,, t*),xj) fw{y' - Vixj))dy'. 



i=l 

i=i 

As for operator H, we split G into a sum of continuous operators. 

Definition 4.21. For allm E {0, . . . , q—l}, we define the operator G"^ : B(M.i{Eo)) — >• 

B{Mi{Eo) X M+) as G'^v{-n,u) = Gv{TT,t;^) if u < t*^ and 

m 



1=1 

X / v{^in,y',t:))j2Q{Hx^,t*),Xj)fw{y'-vix,))dy' 



if u>t*^. In particular, one has 



q-l 



m + l I 

m=0 
Definition 4.22. The operators $J : B(\mathcal{M}_1(E_0))\times B(E) \to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$, $J^m : B(\mathcal{M}_1(E_0))\times B(E) \to B(\mathcal{M}_1(E_0)\times\mathbb{R}_+)$ for all $m\in\{0,\dots,q-1\}$ and $L : B(\mathcal{M}_1(E_0))\times B(E) \to B(\mathcal{M}_1(E_0))$ are defined for all $(v,h)\in B(\mathcal{M}_1(E_0))\times B(E)$ and $(\pi,u)\in\mathcal{M}_1(E_0)\times\mathbb{R}_+$ by

$$J^m(v,h)(\pi,u) = H^m h(\pi,u) + G^m v(\pi,u),$$
$$J(v,h)(\pi,u) = Hh(\pi,u) + Gv(\pi,u),$$
$$L(v,h)(\pi) = \sup_{u\ge 0} J(v,h)(\pi,u).$$

In particular, one has 

q-l 

In view of the above definitions, it seems natural to distinguish whether $t^*_m = t^*_{m+1}$ or not.

Definition 4.23. Let $M\subset\{0,\dots,q-1\}$ be the set of indices $m$ such that $t^*_m < t^*_{m+1}$.

Notice that $M$ is not empty because it contains at least the index 0 since we assumed that $t^*_1 > 0 = t^*_0$. As a straightforward consequence of the previous definitions, one has the following result.

Lemma 4.24. For $(v,h)\in B(\mathcal{M}_1(E_0))\times B(E)$ and $\pi\in\mathcal{M}_1(E_0)$, one has

$$L(v,h)(\pi) = \max_{m\in M}\Big\{\sup_{u\in[t^*_m;\,t^*_{m+1}]} J^m(v,h)(\pi,u)\Big\} \vee Kv(\pi).$$

Definition 4.25. We define recursively the sequence $(v_n)_{0\le n\le N}$ of functions from $\mathcal{M}_1(E_0)$ into $\mathbb{R}$ as follows

$$v_N(\pi) = \sum_{i=1}^{q} g(x_i)\pi^i, \qquad v_{n-1}(\pi) = L(v_n,g)(\pi), \quad 1\le n\le N.$$

The rest of this section is dedicated to proving that $v_n$ is the value function of the optimal stopping problem with horizon $T_{N-n}$.

4.5 Recursive construction of the value function and ε-optimal stopping times

Theorems 4.26 and 4.29 below establish that $v_n$ defined above is the value function of our partially observed optimal stopping problem with horizon $T_{N-n}$. Moreover, Theorem 4.29 gives an explicit construction of a family of ε-optimal stopping times. This section is adapted from [4].

Theorem 4.26. For all $1\le n\le N$ and $\pi\in\mathcal{M}_1(E_0)$, one has

$$\sup_{\sigma\in\Sigma^Y_n} E[g(X_\sigma) \mid \Pi_0=\pi] \le v_{N-n}(\pi).$$
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Proof Let a G . We prove the theorem by induction. For n = 1, Lemma 4.14 
yields 



1=1 

+EE[l{Ti<<x}^7(xi)ni|no = 7r] = (a) + (6). 
Since Rq is deterministic, we recognize, in the term (a), the the form given in Lemma 



4.16) of operator thus (a) = Hg(-K, Rq). We now turn to the term (b). From 



Lemma [4.71 one has 

(6) = E[l|s,<«„}^(7(x.)ni|no = 7r] 

= E[vN(ni)l{s^<Ro}\^o = T^] = GvNin,Ro). 



Recall that from Definition 4.22 one has J(f at, g) = Hg + Gvn thus, adding (a) and 
(b), one obtains 

E[5((X^ATi)|no = vr] = J{vN,g){'TC,Ro) < supJ{vN,g)iT^,u) 

u>0 

= L{vN,g){T^) =Vn-i{t^)- 

Set now 2 < n < N and assume that for all r G ^n-i^ has E[(7(X,-)|no = tt] < 
VN-{n-i){'^)- Lemma 4.14 yields 

E[(7(X.ATj|no = 7r] 

n— 1 q 
A:=0 i=l 

+ j2nMT„<^}9i^^)K\^0 = 7l]. 
1=1 

As in the case n = 1, the term for = equals Hg{TT, Ro). Notice that the other 
terms are null on {Ti > a}, thus we may factorize 1{Ti<o-} = l{5i<Ro} ^^^^ i?Tr 
measurable. Take now the conditional expectation w.r.t. in these terms to 

obtain 

E[^?(X.^Tj|no = tt] = Hg{n,Ro) + E[Hl|5,</?o}|no = vr], (8) 
where we defined S as 



E 



n-l q 



=1 i=i 



1=1 



We now use the Markov property of the chain Indeed, for k > 1, one has 

Ilfc = Uk-ioO. Moreover, when Ti < a, o ne has, from Proposition 4.11 , Rk = R\_io9 
(indeed, we pointed out in Remark 4.9 that Rk can be replaced by Rk defined in 
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Lemma 



4.8 ) and a = Ti + a 06 where Rl_i and a are defined in Definition 



4.10 



and 



Proposition 4.11 (with / = 1 in the present case). Since for A; > 1, = Ti + Tk_io6 
one has l{Tk<a} 
yields 

■ n-2 q 



'^{Tk-i<^} ° ^- Fiiicilly, the Markov property of the chain {Iik)k>o 



En, 



Ei{T„_i<a}^7(a;*)nLi 



4.14 



we rec- 



In other words, define ^(vr) = 'Ei[g{X~^rp^_^)\IiQ = vr]. Using Lemma 
ognize that S = wijli). Moreover, one has w{t:) < fiv_(„„i)(7r) from the induction 
assumption since a A T„_i G (indeed, both a and T!„_i are (5^f )t>o-stopping 



times from Corollary 4.13 and Proposition 4.2 respectively). One has then 



2 < t;Ar-(„_i)(ni). 

Finally, combining Eq. (|8| and ([9]), one has 

E[^?(X,;,Tj|no = vr] < Hg{7r,Ro) + EK^(„_i)(ni)l|s,<iJo}|no = vr] 
In the second term, we recognize the operator G and one has 

E[^(X,;,Tj|no = 7r] < Hg{TT,Ro) + GvN-in-i){Tr,Ro) 

= J{vN-(n-l),9)iT^,Ro) 

< sup J{vN~(n~l),g){T!',u) 
u>0 

= L(fAr_(„_i),5()(7r) = VN-ni-^), 

that proves the induction. 



(9) 



□ 



Theorem 4.26 proves that $v_n$ is an upper bound for the value function of the problem with horizon $T_{N-n}$. We now prove the reverse inequality by constructing a sequence of ε-optimal stopping times.

Definition 4.27. For e > 0, 1 <n < N and for vr G Mi{Eo), we define 

r^(7r) = inf {m > .■ J{vN-n, g)iTT,u) > fAr-n-l(7r) - e} . 

Consider RIq = '"o(no) ^.n-d for 2 < n < N , 



R: 
R. 
R. 



'n,0 



'n,n— 1 



r, 



r. 



e/(2'=+i) 
n—l—k 
./(2"-i) 




{^k)l{R^^^_^>s,} fori <k< n-2, 

(nn-l)l{iJ' _2>5„_i}, 



and finally set 



k=l 

The following lemma concerns the effect of the translation operator 6 on the 
sequence (-R^^fc)i<„<Af,o<fc<n-i- 

Lemma 4.28. For n > 2 and 1 < k < n — 1, on the set {Ti < U^'^}, one has 



R 



2e 

'n,k' 
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Proof For n = 2, one just has to prove that on the event {Ti < f/l*^}, one has 
R\qO = i?2V Yet, from the difinition of the sequence (-R^ fc)i<n<Af,o<fc<n-i, one 

has R\ q o 9 = rg(ni) and R'^''^ = Tq (ni)l|j:j2e^^>5^|. The resuh follows since we are 

on the event {Ti < t/l*^} = {R'^'^q > ^i}. For a fixed n > 3, we prove the lemma 
by induction onl<A;<'n, — 1. Set k = 1. One has from the definition on the 

sequence (i?^,fc)i<„<7v,o<fc<n-i, K-ifi ° ^ = ^n-2(ni) and Rl'^^ = C_2(ni)l|^2.^>5^|. 
We obtain -R^„i q o = R^^^ because we have assumed that we are on the event 
{Ti < f/^''} = {R^Q > Si}. The propagation of the induction is similar to the case 
A; = l. ' □ 

Equipped with this preliminary result, we may now prove that $(U^\varepsilon_n)_{1\le n\le N}$ is a sequence of ε-optimal stopping times for the observation filtration.

Theorem 4.29. For all $1\le n\le N$ and $\varepsilon>0$, one has $U^\varepsilon_n\in\Sigma^Y_n$ and

$$E[g(X_{U^\varepsilon_n}) \mid \Pi_0=\pi] \ge v_{N-n}(\pi) - \varepsilon.$$

Proof Let n G {1, • • • ,N}. First notice that, as a direct consequence of Propo- 
sition 



4.12 



is an )t>o-stopping time since, by construction, the i?^^ are 
5^1^^ -measurable and satisfy the condition R^ i^ = on the event {Sk > Rn^k-i}- is 
also clear that < Ylk=i = Tn- Thus, one has G E^. Let us now prove the 
second assessment by induction. Set n = 1. Let vr G A^i(i?o), we denote Tq = rQ{n). 
Since R\q = is deterministic, one has clearly R\q E and we may apply Lemma 



4.14 



to a = R\q (and n = 1) which yields 

i=l 

+ ^E[i|5,<ra^7(a;.)ni|no = 7r] = (c) + (rf). 



i=l 



As in the proof of the previous theorem, we recognize respectively the operators H 
and G in the terms (c) and (d). More precisely, (c) = Hg{7i,rQ) and 

{d) = E[l|5,<r.aE^7(x.)ni|no = tt] = E[l|s,<r5}t^iv(ni)|no = tt] = Gvr,{7f,rl), 

i=l 

so that, adding (c) and (d) yields E[g{Xjic ^;,Si)\^o = vr] = J{vn, g){T^,r^o)- Finally, 
the definition of yields J(fiv, 5')(7r, Tq) > fAr-i(vr) — e thus one has 

E[c/(XR.^^;,5j|no = tt] > vn^ti) - e. 

Now set 2 < n < and assume that for all e > 0, one has E[5f(X(/E_J|no = tt] > 



t>7v-{ri-i)(vr) — e. Lemma 4.14 yields 



E[g{Xu^^.)\Uo = n] 



k=0 1=1 



+ EE[l{T„<r/^.}^?(x.)n^|no = 7r]. 



i=l 
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Denote r^_^ = r^_i(vr). As in the case n = 1, the term for = equals Hg {it, r^_i) 
since R'^q = r'^_^{Ilo). Take the conditional expectation w.r.t. Jji the other 
terms. One has then, 



with 



E 



r n-1 Q 
k=l i=l 

+ j2MTr.<U?^^}9{Xi)K\dl 

i=l 



(10) 



We wish to apply the Markov property of (nfc)fcgi!^ in the term H'. Recall that, from 
Lemma 4.28, one has R^-i k-i ° = Rn^k n > 2 and 1<A;<?t, — Ion the event 



{Ti < Un\ {^i ^ -^n^o} (the equality of these events stems from Lemma 4.7). 



Thus, on this set one has 



Ul' = ^1 + E Rn,k~l ^Sk = T^ + E(^n-l,fc-2 ° 0) A {Sk-l 



k=2 



k=2 



Ti + u:_, o e. 



Besides, recall that for k > 1, Tk = Ti + Tk^i o 6 and thus one has '^{Tk<u^^} 
l{j'^_j<(7E J o 6. Therefore, the Markov property of the chain {Ilk)k>o yields 



En, 



■ n-2 q 

E E im<c/^_ ji{i?4_,,,<ta^ ° 

- fc=0 i=l 

q 

i=l 



In other words, define ^'(vr) 



E 



Ho = 7r . Using Lemma 



4.14 



we rec- 



ognize that S' = w'{Ili). Moreover, thanks to the induction assumption, one has 
w'{7i) > f Ar_(n-i)(7r) — £ SO that One obtains 



Ill 



Finally, combining Eq. (10) and (11) and noticing that, according to Lemma 4.7 
{^1 < U^'} = {Si < r^_J, one obtains 

E[g{Xu2.)\Uo = tt] > Hg{'K,r'^_^) + E[vN-in^i){ni)l{s,<r^^,,}\^o = tt] - e 
= J(t;iv-(n-i),^)(7r,r^_i) - e 

> VN-n{T^) - 2e, 

from the definition of r'^n-i- Hence, the result. □ 



Theorems 4.26 and 4.29 establish that $v_n$ is the value function of the problem with horizon $T_{N-n}$ and in particular that $v_0$ is the value function of problem (3).
Introduce now the sequence $(V_n)_{0\le n\le N}$ of random variables defined by $V_n = v_n(\Pi_n)$. In other words, one has $V_N = \sum_{i=1}^{q} g(x_i)\Pi^i_N$ and for $0\le n\le N-1$, the dynamic programming equation yields the recursion

$$V_n = \sup_{u\in[0;\,t^*_q]} E\big[g\circ\Phi(Z_n,u)1_{\{S_{n+1}>u\}} + V_{n+1}1_{\{S_{n+1}\le u\}} \mid \Pi_n\big]$$
$$\phantom{V_n} = \max_{m\in M}\Big\{\sup_{u\in[t^*_m;\,t^*_{m+1}[} E\big[g\circ\Phi(Z_n,u)1_{\{S_{n+1}>u\}} + V_{n+1}1_{\{S_{n+1}\le u\}} \mid \Pi_n\big]\Big\} \vee E[V_{n+1} \mid \Pi_n].$$

Our next goal is to provide a numerical scheme based on a discretization of this backward recursion to obtain an approximation of the value function $V_0$ and derive a family of ε-optimal stopping times that can be numerically computed in practice.



5 Numerical approximation by quantization 

We are now concerned with numerical approximations. Our approach relies on a discretization of the process $(\Pi_n, S_n)_{0\le n\le N}$, which fully determines our dynamic programming equation. We follow the idea introduced in [2, Section 3.1]. The key point is the Markov property of the process $(\Pi_n, S_n)_{0\le n\le N}$ stated by Proposition 4.1. Therefore, it is possible to discretize this chain by quantization as explained below.



5.1 The quantization approach 

There exists an extensive literature on quantization methods for random variables and processes. We do not attempt to give an exhaustive survey of these methods here. However, the interested reader may, for instance, consult the following works [HI El E] and references therein. Consider $X$ an $\mathbb{R}^r$-valued random variable such that $\|X\|_p<\infty$ where $\|X\|_p$ denotes the $L^p$-norm of $X$: $\|X\|_p = (E[|X|^p])^{1/p}$. Let $\nu$ be a fixed integer. The optimal $L^p$-quantization of the random variable $X$ consists in finding the best possible $L^p$-approximation of $X$ by a random vector $\hat X$ taking at most $\nu$ values: $\hat X\in\{x^1,\dots,x^\nu\}$. This procedure consists in the following two steps:

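1. Find a finite weighted grid $\Gamma\subset\mathbb{R}^r$ with $\Gamma = \{x^1,\dots,x^\nu\}$.

2. Set $\hat X = \hat X^\Gamma$ where $\hat X^\Gamma = \mathrm{proj}_\Gamma(X)$ with $\mathrm{proj}_\Gamma$ denoting the closest neighbour projection on $\Gamma$.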
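The second step can be sketched in a few lines. The snippet below is only an illustration, assuming a grid is already given (it does not implement any grid optimization such as CLVQ); the helper names `project` and `empirical_lp_error` are hypothetical.

```python
import math

def project(point, grid):
    """Closest-neighbour projection: return the grid point nearest to `point`.

    Points are tuples of floats; ties go to the first grid point encountered.
    """
    return min(grid, key=lambda g: math.dist(point, g))

def empirical_lp_error(samples, grid, p=2.0):
    """Monte Carlo estimate of the L^p quantization error ||X - proj(X)||_p."""
    total = sum(math.dist(s, project(s, grid)) ** p for s in samples)
    return (total / len(samples)) ** (1.0 / p)

# Toy usage with a two-point grid on the real line (illustrative values only).
grid = [(0.0,), (1.0,)]
samples = [(0.1,), (0.9,), (0.4,)]
err = empirical_lp_error(samples, grid, p=2.0)
```

With simulated samples of $X$, the same estimator gives an empirical counterpart of the quantization error that appears in the convergence bounds.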

The asymptotic properties of the $L^p$-quantization are given by the following result,
see e.g. p. 

Theorem 5.1. If $E[|X|^{p+\eta}] < +\infty$ for some $\eta>0$ then one has

$$\lim_{\nu\to\infty} \nu^{p/r} \min_{|\Gamma|\le\nu} \|X-\hat X^\Gamma\|_p^p = J_{p,r}\Big(\int |h|^{r/(r+p)}(u)\,du\Big)^{1+p/r},$$

where the distribution of $X$ is $P_X(du) = h(u)\lambda_r(du) + \mu$ with $\mu\perp\lambda_r$, $J_{p,r}$ a constant and $\lambda_r$ the Lebesgue measure in $\mathbb{R}^r$.
There exists a similar procedure for the optimal quantization of a Markov chain. Our approximation method is based on the quantization of the Markov chain $(\Pi_k, S_k)_{k\le N}$. Thus, from now on, we will denote, for $0\le k\le N$, $\Theta_k = (\Pi_k, S_k)$. The CLVQ (Competitive Learning Vector Quantization) algorithm [9, Section 3] provides for each time step $0\le k\le N$ a finite grid $\Gamma_k$ of $\mathcal{M}_1(E_0)\times\mathbb{R}_+$ as well as the transition matrices $(\hat Q_k)_{0\le k\le N-1}$ from $\Gamma_k$ to $\Gamma_{k+1}$. Let $p\ge 1$ be such that for all $k\le N$, $\Pi_k$ and $S_k$ have finite moments at least up to order $p$ and let $\mathrm{proj}_{\Gamma_k}$ be the nearest-neighbor projection from $\mathcal{M}_1(E_0)\times\mathbb{R}_+$ onto $\Gamma_k$. The quantized process $(\hat\Theta_k)_{k\le N} = (\hat\Pi_k, \hat S_k)_{k\le N}$ with value for each $k$ in the finite grid $\Gamma_k$ of $\mathcal{M}_1(E_0)\times\mathbb{R}_+$ is then defined by

$$(\hat\Pi_k, \hat S_k) = \mathrm{proj}_{\Gamma_k}(\Pi_k, S_k).$$

We will also denote by $\Gamma^\Pi_k$ the projection of $\Gamma_k$ on $\mathcal{M}_1(E_0)$, and by $\Gamma^S_k$ the projection of $\Gamma_k$ on $\mathbb{R}_+$.

Some important remarks must be made concerning the quantization. On the one hand, the optimal quantization has nice convergence properties stated by Theorem 5.1. Indeed, the $L^p$-quantization error $\|\Theta_k-\hat\Theta_k\|_p$ goes to zero when the number of points in the grids goes to infinity. However, on the other hand, the Markov property is not maintained by the algorithm and the quantized process is generally not Markovian. Although the quantized process can be easily transformed into a Markov chain, this chain will not be homogeneous. It must be pointed out that the quantized process $(\hat\Theta_k)_{k\in\mathbb{N}}$ depends on the starting point $\Theta_0$ of the process.

In practice, we begin with the computation of the quantization grids, which merely requires being able to simulate the process. Notice that in our case, what is actually simulated is the sequence of observations $(Y_k, S_k)_{0\le k\le N}$. We are then able to compute the filter $(\Pi_k)_{0\le k\le N}$ thanks to the recursive equation provided by Proposition 3.4. The grids are computed only once and may be stored off-line. Our schemes are then based on the following simple idea: we replace the process by its quantized approximation within the different recursions. The computation is thus carried out in a very simple way since the quantized process has finite state space.



5.2 Approximation of the value function 

Our approximation scheme of the sequence $(V_n)_{0\le n\le N}$ follows the same lines as in [5], but once more, the results therein cannot be applied directly as the Markov chain $(\Theta_k)_{k\in\mathbb{N}}$ is not the underlying Markov chain of some PDMP. Our approach decomposes in two steps. The first one is to discretize the continuous-time maximization of the operator $L$ to obtain a maximization over a finite set. The second step consists in replacing the Markov chain $(\Theta_n)_{n\in\mathbb{N}} = (\Pi_n, S_n)_{n\in\mathbb{N}}$ by its quantized approximation $(\hat\Theta_n)_{n\in\mathbb{N}} = (\hat\Pi_n, \hat S_n)_{n\in\mathbb{N}}$ within the dynamic programming equation. Thus, the conditional expectations will become easily tractable finite sums.

Definition 5.2. Let A > be such that 

A < ^ min { |t* - t* \ with 0<i,j,<q such that t* ^ } • (12) 
For all $m\in M$, let $Gr_m(\Delta)$ be the finite grid on $[t^*_m; t^*_{m+1}]$ defined as follows

$$Gr_m(\Delta) = \{t^*_m + i\Delta,\ 1\le i\le i_m\} \cup \{t^*_{m+1}-\Delta\},$$

where $i_m = \max\{i\in\mathbb{N} \text{ such that } t^*_m + i\Delta \le t^*_{m+1}-\Delta\}$. We also denote $Gr(\Delta) = \bigcup_{m\in M} Gr_m(\Delta)$.



Remark 5.3. Let $m\in M$. Notice that, thanks to Eq. (12), $Gr_m(\Delta)$ is not empty. Moreover, it satisfies two properties that will be crucial in the sequel:

a. for all $t\in[t^*_m; t^*_{m+1}]$, there exists $u\in Gr_m(\Delta)$ such that $|u-t|\le\Delta$,

b. for all $u\in Gr_m(\Delta)$ and $0<\eta<\Delta$, one has $[u-\eta; u+\eta]\subset\,]t^*_m; t^*_{m+1}[$.
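The grid and the two properties of Remark 5.3 can be checked numerically. The following is a minimal sketch, assuming the step satisfies $2\Delta < t^*_{m+1}-t^*_m$ (our working assumption here, chosen so that $i_m\ge 1$); the function name `grid_m` and the small floating-point tolerance are ours.

```python
import math

def grid_m(t_m, t_m1, delta, tol=1e-12):
    """Build Gr_m(delta) = {t_m + i*delta, 1 <= i <= i_m} U {t_m1 - delta}.

    Assumption (ours): 0 < 2*delta < t_m1 - t_m, so that i_m >= 1.
    """
    if not (0.0 < 2.0 * delta < t_m1 - t_m):
        raise ValueError("delta must satisfy 0 < 2*delta < t_m1 - t_m")
    # i_m is the largest integer i with t_m + i*delta <= t_m1 - delta.
    i_m = math.floor((t_m1 - delta - t_m) / delta + tol)
    points = [t_m + i * delta for i in range(1, i_m + 1)]
    last = t_m1 - delta
    if abs(points[-1] - last) > tol:  # avoid listing the same point twice
        points.append(last)
    return points

pts = grid_m(0.0, 1.0, 0.25)
```

Every grid point stays at distance at least $\Delta$ from both endpoints, which is what property b relies on, while consecutive points are at most $\Delta$ apart, which gives property a.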

A discretized maximization operator $L^d$ is then defined as follows.

Definition 5.4. Let $L^d : B(\mathcal{M}_1(E_0))\times B(E) \to B(\mathcal{M}_1(E_0))$ be defined for all $\pi\in\mathcal{M}_1(E_0)$ by

$$L^d(v,h)(\pi) = \max_{m\in M}\Big\{\max_{u\in Gr_m(\Delta)} J^m(v,h)(\pi,u)\Big\} \vee Kv(\pi).$$

We now proceed to our second step: replacing the Markov chain $(\Theta_n)_{n\in\mathbb{N}} = (\Pi_n, S_n)_{n\in\mathbb{N}}$ by its quantized approximation $(\hat\Theta_n)_{n\in\mathbb{N}} = (\hat\Pi_n, \hat S_n)_{n\in\mathbb{N}}$ within the operators involved in the construction of the value function.

Definition 5.5. We define the quantized operators Gn, K^, Jn O'nd for 
n e {I,. . . ,N}, V e B{Tn), h G B{E), vr G r^^^ andu>{] as follows 

i=l 

Gnv{n,u) = E[t;(n„)l^^^^^-^|n„_i = vr], 
Knv{7r) = E[t;(n„)|n„_i = 7r], 
Jn{v,h){n,u) = Hnh{n,u) + Gnv{n,u), 
L'i(v,h)(iT) = max I max {Jn(v, h)(7c,u)}\ \/ Rnvirc). 

The quantized approximation of the value function naturally follows. 
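Definition 5.6. For $0\le n\le N$, define the functions $\hat v_n$ on $\Gamma^\Pi_n$ as follows

$$\hat v_N(\pi) = \sum_{i=1}^{q} g(x_i)\pi^i \ \text{ for all } \pi\in\Gamma^\Pi_N, \qquad \hat v_{n-1}(\pi) = \hat L^d_n(\hat v_n,g)(\pi) \ \text{ for all } \pi\in\Gamma^\Pi_{n-1} \text{ and } 1\le n\le N.$$

For $0\le n\le N$, let $\hat V_n = \hat v_n(\hat\Pi_n)$.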

We may now state our main result for the numerical approximation. 
Theorem 5.7. Let $\Delta>0$ be such that for all $0\le n\le N-1$,

$$\Delta > (2C_\lambda)^{-1/2}\,\|S_{n+1}-\hat S_{n+1}\|_p^{1/2},$$

then, one has the following bound for the approximation error

$$\|V_n-\hat V_n\|_p \le \|V_{n+1}-\hat V_{n+1}\|_p + a\Delta + b\,\|S_{n+1}-\hat S_{n+1}\|_p^{1/2} + c_n\|\Pi_n-\hat\Pi_n\|_p + 2[v_{n+1}]\,\|\Pi_{n+1}-\hat\Pi_{n+1}\|_p,$$

where $a = [g]_2 + 2C_gC_\lambda$, $b = 4C_g(2C_\lambda)^{1/2}$ and $c_n = [v_n] + 4C_g + 2[v_{n+1}]$ with $[v_{n+1}]$ defined in Proposition B.1 and $[g]_2$ defined in Assumption 2.1.



Theorem 5.7 establishes the convergence of our approximation scheme and provides a bound for the rate of convergence. More precisely, it gives a rate for the convergence of $\hat V_0$ towards $V_0$. Indeed, one has $\|V_N-\hat V_N\|_p = \|\sum_{i=1}^{q} g(x_i)(\Pi^i_N-\hat\Pi^i_N)\|_p \le C_g\|\Pi_N-\hat\Pi_N\|_p$ so that $|V_0-\hat V_0|$ can be made arbitrarily small when the quantization errors $(\|\Theta_n-\hat\Theta_n\|_p)_{0\le n\le N}$ go to zero, i.e. when the number of points in the quantization grids goes to infinity. In order to prove Theorem 5.7, we proceed similarly to [3] and split the approximation error into four terms $\|V_n-\hat V_n\|_p \le \Xi_1+\Xi_2+\Xi_3+\Xi_4$, with

$$\Xi_1 = \|v_n(\Pi_n) - v_n(\hat\Pi_n)\|_p,$$
$$\Xi_2 = \|L(v_{n+1},g)(\hat\Pi_n) - L^d(v_{n+1},g)(\hat\Pi_n)\|_p,$$
$$\Xi_3 = \|L^d(v_{n+1},g)(\hat\Pi_n) - \hat L^d_{n+1}(v_{n+1},g)(\hat\Pi_n)\|_p,$$
$$\Xi_4 = \|\hat L^d_{n+1}(v_{n+1},g)(\hat\Pi_n) - \hat L^d_{n+1}(\hat v_{n+1},g)(\hat\Pi_n)\|_p.$$

The bound for the first term is straightforward from Proposition B.7.

Lemma 5.8. The first term $\Xi_1$ is bounded as follows

$$\|v_n(\Pi_n) - v_n(\hat\Pi_n)\|_p \le [v_n]\,\|\Pi_n-\hat\Pi_n\|_p.$$

The other error terms are studied separately in the following sections.
5.2.1 Second term of the error 

For the second error term, we investigate the consequences of replacing the continuous maximization in operator $L$ by a discrete one on $Gr(\Delta)$.

Lemma 5.9. For all $m\in M$, $v\in B(\mathcal{M}_1(E_0))$ and $\pi\in\mathcal{M}_1(E_0)$ one has

$$\sup_{u\in[t^*_m;\,t^*_{m+1}]} J^m(v,g)(\pi,u) - \max_{u\in Gr_m(\Delta)} J^m(v,g)(\pi,u) \le ([g]_2 + C_gC_\lambda + C_vC_\lambda)\,\Delta.$$
Proof The function $u\mapsto J^m(v,g)(\pi,u)$ being continuous, there exists $t\in[t^*_m; t^*_{m+1}]$ such that $\sup_{u\in[t^*_m;\,t^*_{m+1}]} J^m(v,g)(\pi,u) = J^m(v,g)(\pi,t)$. Moreover, from Remark 5.3 a, one may choose $u\in Gr_m(\Delta)$ so that $|u-t|\le\Delta$. Propositions B.1 and B.4 stating the Lipschitz continuity of $J^m$ then yield

$$0 \le \sup_{u\in[t^*_m;\,t^*_{m+1}]} J^m(v,g)(\pi,u) - \max_{u\in Gr_m(\Delta)} J^m(v,g)(\pi,u)$$
$$\le J^m(v,g)(\pi,t) - J^m(v,g)(\pi,u)$$
$$\le ([g]_2 + C_gC_\lambda + C_vC_\lambda)\,|t-u| \le ([g]_2 + C_gC_\lambda + C_vC_\lambda)\,\Delta.$$

Hence, the result. □

Lemma 5.10. The second term $\Xi_2$ is bounded as follows

$$\|L(v_{n+1},g)(\hat\Pi_n) - L^d(v_{n+1},g)(\hat\Pi_n)\|_p \le ([g]_2 + 2C_gC_\lambda)\,\Delta.$$

Proof This is a straightforward consequence of the previous lemma once it has been noticed that for all $a,b,c,d\in\mathbb{R}$, one has $|a\vee b - c\vee d| \le |a-c| \vee |b-d|$. Notice also that Proposition B.7 provides $C_{v_{n+1}}\le C_g$. □



5.2.2 Third term of the error 

To investigate the third error term, we use the properties of quantization to bound 
the error made by replacing an operator by its quantized approximation. As in [5], 
we must first deal with non-continuous indicator functions. 



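Lemma 5.11. For all $0\le n\le N-1$, $m\in M$ and $0<\eta<\Delta$, one has

$$\Big\|\max_{u\in Gr_m(\Delta)} E\big[\,|1_{\{S_{n+1}\le u\}} - 1_{\{\hat S_{n+1}\le u\}}| \ \big|\ \hat\Pi_n\big]\Big\|_p \le \eta^{-1}\|S_{n+1}-\hat S_{n+1}\|_p + 2\eta C_\lambda.$$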



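Proof Let $0<\eta<\Delta$. The difference of the indicator functions equals 1 if and only if $S_{n+1}$ and $\hat S_{n+1}$ lie on either side of $u$. Therefore, if the difference of the indicator functions equals 1, either $|S_{n+1}-u|\le\eta$, or $|S_{n+1}-u|>\eta$ and in the latter case $|S_{n+1}-\hat S_{n+1}|>\eta$ too since $|S_{n+1}-\hat S_{n+1}|\ge|S_{n+1}-u|$. One has $|1_{\{S_{n+1}\le u\}} - 1_{\{\hat S_{n+1}\le u\}}| \le 1_{\{|S_{n+1}-\hat S_{n+1}|>\eta\}} + 1_{\{|S_{n+1}-u|\le\eta\}}$, leading to

$$\Big\|\max_{u\in Gr_m(\Delta)} E\big[\,|1_{\{S_{n+1}\le u\}} - 1_{\{\hat S_{n+1}\le u\}}| \ \big|\ \hat\Pi_n\big]\Big\|_p \le \|1_{\{|S_{n+1}-\hat S_{n+1}|>\eta\}}\|_p + \Big\|\max_{u\in Gr_m(\Delta)} E\big[1_{\{|S_{n+1}-u|\le\eta\}} \mid \hat\Pi_n\big]\Big\|_p.$$

On the one hand, Markov inequality yields

$$\|1_{\{|S_{n+1}-\hat S_{n+1}|>\eta\}}\|_p = P(|S_{n+1}-\hat S_{n+1}|>\eta)^{1/p} \le \|S_{n+1}-\hat S_{n+1}\|_p\,\eta^{-1}.$$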



On the other hand, since $u\in Gr_m(\Delta)$, one has (see Remark 5.3 b) $[u-\eta; u+\eta]\subset\,]t^*_m; t^*_{m+1}[$, thus $S_{n+1}$ has an absolutely continuous distribution on the interval $[u-\eta; u+\eta]$ since it does not contain any of the $t^*_i$. Besides, recall that $\hat\Theta_n = \mathrm{proj}_{\Gamma_n}(\Theta_n)$, hence $\sigma(\hat\Pi_n)\subset\sigma(\hat\Theta_n)\subset\sigma(\Theta_n)$. We also have $\sigma(\Theta_n)\subset\mathcal{F}^Y_{T_n}$, so the law of iterated conditional expectations provides



E[l{|5„+i-«|<r,}|nr, 



E 

E 
E 



u+rj 



u—rj 
1 pu+r] 



A($(Z„,s))rfs 





fin 




fin 



A($(xi, s) Ids 



Finally, one obtains $E[1_{\{|S_{n+1}-u|\le\eta\}} \mid \hat\Pi_n] \le 2\eta C_\lambda$, hence, the result. □



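Lemma 5.12. For all $0\le n\le N-1$, one has

$$|Kv_{n+1}(\hat\Pi_n) - \hat K_{n+1}v_{n+1}(\hat\Pi_n)| \le [v_{n+1}]\,E\big[|\Pi_{n+1}-\hat\Pi_{n+1}| \ \big|\ \hat\Pi_n\big] + (2C_g + 2[v_{n+1}])\,E\big[|\Pi_n-\hat\Pi_n| \ \big|\ \hat\Pi_n\big].$$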



Proof One has 



\KVn+l(Jln) - Kn+lVn+l(Iln)\ 

= |EK+i(n„+i)|n„ = Un] - EK+i(fi„+i)|fi„]| 
< |EK+i(n„+i)|n„ = fi„] - EK+i(n„+i)|fi„]| 

+ |EK+i(n„+i) - t;„+i(fi„+i)|fi„]| = (e) + (/) 



in. 



n+l 



n 



n+l 



n. 



. On the 



On the one hand. Proposition B.7 yields (/) < [tin+ijE 

other hand, one has (n^, Sn) = projr„(n„, S'„) so that cr(n„) C cr(n„, Sn), the law 
of iterated conditional expectations gives 



E[t;„+i(n„ 



+1. 



n. 



E 



E[t;„+i(n„+i)|n„,s'„ 



n. 



Moreover, Proposition 4.1 yields that the conditional distribution of n„+i w.r.t. 
(n„, Sn) merely depends on n„ thus one has E[f„+i(n„+i)|n„, Sn] = E[t;„+i(n„+i)|n„] 
One has then 



E 



EK+i(n„+i)|n„ = fi„] - EK+i(n„+i)|n, 



n. 



|E[i^t;„+l(n„) - KVn+l{Iln)\Un] 



We conclude thanks to Proposition |B.5| stating the Lipschitz continuity of operator 
K and Proposition B.7| concerning the properties of the value function and stating 
in particular that Cy^^-^ ^ Cg. □ 



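Lemma 5.13. For all $0<\eta<\Delta$, an upper bound for the third term $\Xi_3$ is

$$\|L^d(v_{n+1},g)(\hat\Pi_n) - \hat L^d_{n+1}(v_{n+1},g)(\hat\Pi_n)\|_p \le [v_{n+1}]\,\|\Pi_{n+1}-\hat\Pi_{n+1}\|_p + (4C_g + 2[v_{n+1}])\,\|\Pi_n-\hat\Pi_n\|_p + 2C_g\big(\|S_{n+1}-\hat S_{n+1}\|_p\,\eta^{-1} + 2\eta C_\lambda\big).$$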
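As a side computation (a sketch under the notation $C_g$, $C_\lambda$ used above, not a statement taken from the paper), the last term of this bound can be minimized over $\eta$, which suggests where the constant $b$ and the condition on $\Delta$ in Theorem 5.7 come from. Writing $\delta = \|S_{n+1}-\hat S_{n+1}\|_p$ (a shorthand introduced only here):

```latex
\begin{aligned}
\eta^\star &= \Big(\frac{\delta}{2C_\lambda}\Big)^{1/2}
  \quad\text{minimizes}\quad \eta \mapsto \delta\,\eta^{-1} + 2\eta C_\lambda, \\
\delta\,(\eta^\star)^{-1} &= (2C_\lambda)^{1/2}\,\delta^{1/2},
  \qquad 2\eta^\star C_\lambda = (2C_\lambda)^{1/2}\,\delta^{1/2}, \\
2C_g\big(\delta\,(\eta^\star)^{-1} + 2\eta^\star C_\lambda\big)
  &= 4C_g\,(2C_\lambda)^{1/2}\,\delta^{1/2}.
\end{aligned}
```

The choice $\eta^\star$ is admissible only if $\eta^\star<\Delta$, that is $\Delta > (2C_\lambda)^{-1/2}\delta^{1/2}$.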
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Proof One has 

|L'^K+i,^7)(n„) -L:^+iK+i,^7)(n„)| 

< max| max \J'^{vn+i,g){fln,u) - Jn+i{vn+i,g){fln,u)\} 

y\Kvn+i{fin) - i^'„+iw„+i(n„)|. 

The term involving operator K was studied in the previous lemma. Let u be 
a fixed element of G'rm(A) and define Q;(7r, vr', s') = Z]?=i ^*fl'('^(^i) + 
t>n+i(7r')l{5/<„}. One has then 



I J"'(w„+i,5f)(n„,M) - J„+i(w„+i,5f)(n„,u)| 

= E[a(n„,n„+i,5„+i)|n„ = n„] -E[a(n„,n„+i,^„+i)|n„] 



< A + 



where, proceeding as in Lemma 5.12 



A =|E[a(n„,n„+i,s'„+i) - a(n„,n„+i,5'„+i) n„]|, 



B 



E 



E[«(n„,n„+i,s„+i)|n„ = n„t] -E[«(n„,n„+i,5„+i)|n„ 



Proposition B.7| states that C^„^i < Cg, thus one has 



A < CgE 



|n„ - Ur 



+ K+i]E |n„+i - n„+i 



+2C„E 



(13) 



In the term B, we recognize the operator J*", 5 = E[J™(t>„+i, g)(Jln, w)— J™(w„+i, 5')(n„, M)|nn] 
and from Propositions B.l and |B.4 , one has 



in„ - m 



B <{3Cg + 2[v^+,])E 
We gather the bounds provided by Eq. ( [I3| , (14) and Lemma 5.12 to obtain 
I L\vn+, , g) (n„) - LJ^+i (t;„+i , g) (n„) | 



(14) 



< K+i]E 



in 



n+l 



n 



n+l I 



+ {{4Cg + 2K+i]) V (2Cg + 2K+i]))e 



|n„,-nj 



+2C„ max E 



II 



{5n+l<«} 1{S„ + 1<«} 



We conclude by taking the norm in the equation above and using Lemma 5.11 
to bound the last term. □ 



5.2.3 Fourth term of the error 

Finally, the fourth error term is bounded using Lipschitz properties. 
Lemma 5.14. The fourth term $\Xi_4$ is bounded as follows

$$\|\hat L^d_{n+1}(v_{n+1},g)(\hat\Pi_n) - \hat L^d_{n+1}(\hat v_{n+1},g)(\hat\Pi_n)\|_p \le [v_{n+1}]\,\|\Pi_{n+1}-\hat\Pi_{n+1}\|_p + \|V_{n+1}-\hat V_{n+1}\|_p.$$
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Proof One has 



l^n+iK+i,^)(n„) - L^+i(0„+i,^)(n„)||p 



max max {Hn+i9(^n,u) + G„+it;„+i(n„, m)} V Kn+iVn+i{fln) 
- max max |if„+i5((n„, u) + G„+ii;„+i(n„, m)| V ir„+i?}„+i(n„ 



< 



max max E 

meMueGr™(A) 



{•^n + l^Cu} 



(f„+l(n„+l) -Vn+l{fln+l] 

vE[f;„+i(n„+i) - {;„+i(n„+i)|n„]||^ 
< ||w„+i(n„+i) - w„+i(nn+i)llp- 

We now introduce f„+i(n„_|_i) to split this term into two differences. The Lipschitz 
continuity of Vn+i stated by Proposition B.7 allows us to bound the first term while 
we recognize Vn+i and Vn+i in the second one. 



nj Wp 



I £^+1 {vn+i , g) (n„) - {vn+1 , g) (n, 
< ||^;„+l(^„+l) - t;„+i(n„+i)||p + ||i;„+i(n„+i) - {;„+i(n„+i) 



< K+i] 

Hence, the result. 



n„+i - n 



ra+l 



□ 



5.3 Numerical construction of an e-optimal stopping time 

As in the previous section, we follow the idea of [2] and we use both the Markov chain $(\Theta_n)_{0\le n\le N}$ and its quantized approximation $(\hat\Theta_n)_{0\le n\le N}$ to approximate the expression of the ε-optimal stopping time introduced in Definition 4.27. We check that we thus obtain actual stopping times for the observed filtration $(\mathcal{F}^Y_t)_{t\ge 0}$ and that the expected reward when stopping then is a good approximation of the value function $V_0$. For all $(\pi,s)\in\mathcal{M}_1(E_0)\times\mathbb{R}_+$ and $0\le n\le N$, we denote $(\hat\pi_n,\hat s_n) = \mathrm{proj}_{\Gamma_n}(\pi,s)$. Let

s^_„(vr, s) =mm{teGr{A) : J„(w„, fi')(7f„_i, t) = max J„(?;„,5()(7f„_i,n)}. 

ueGr(A) 

For 1 < n < and tt G A^i(i?o), we define 



Let now for n > 1, 



t* if A'„t;„(7f„_i) > max„eGr(A) J„(t;„, fi')(7f„_i, m), 

s^_„(7r,s) otherwise. 




= f„_i(no,S'o), 

= f„_i_fc(nfc, Sk)l0^ k-i>Sk} for 1 < ^ < - 2, 

and set $\hat U_n = \sum_{k=1}^{n} \hat R_{n,k-1}\wedge S_k$. The following result is a direct consequence of Proposition 4.12. It is a very strong result as it states that the numerically computable random variables $\hat U_n$ are actual $(\mathcal{F}^Y_t)_{t\ge 0}$-stopping times.

Theorem 5.15. For $0\le n\le N$, $\hat U_n$ is an $(\mathcal{F}^Y_t)_{t\ge 0}$-stopping time.



We now intend to prove that stopping at time $\hat U_n$ provides a good approximation of the value function $V_0$. For all $\pi\in\mathcal{M}_1(E_0)$ and $0\le n\le N$ we therefore introduce the expected reward functions when abiding by the stopping rule $(\hat U_n)_{0\le n\le N}$ and the corresponding random variables

$$\bar v_n(\pi) = E[g(X_{\hat U_{N-n}}) \mid \Pi_0=\pi], \qquad \bar V_n = \bar v_n(\Pi_n).$$

Theorem 5.16. Let A > so that for aUO<n<N - 1, 



A > (2Ca) ^^^ll^n+i — Sn+l\ 



1/2 



one has then the following bound for the error between the expected reward when 
stopping at time Un and the value function 



Vr 



n+1 



V. 



n+1 



+ 



Vn-Vn 



+ 



n+1 



n+1 



+'^n||n„ — n„||p + 2[f„+i]||n„+i — n„+i||p 
+&||'S'„+i — s'„+i||y^, 



1/2 



where b = 4Cg( 2Ca) , = QCg + 4[t>„+i], [f„+i] defined in Proposition 



5.7 



It is important to notice that $\bar v_N(\pi) = \sum_{i=1}^{q} g(x_i)\pi^i = v_N(\pi)$ and thus $\bar V_N = V_N$. Therefore, the previous theorem proves that $|V_0-\bar V_0|$ goes to zero when the quantization errors $(\|\Theta_n-\hat\Theta_n\|_p)_{0\le n\le N}$ go to zero. In other words, the expected reward $\bar V_0$ when stopping at the random time $\hat U_N$ can be made arbitrarily close to the value function $V_0$ of the partially observed optimal stopping problem (3) and hence $\hat U_N$ is an ε-optimal stopping time.



Proof The first step consists in finding a recursion satisfied by the sequence 
{Vn)o<n<N ill order to compare it with the dynami c prog ramming equation giving 
iVn)o<n<N- Let < n < — 1. First of all. Lemma 



4.14 



gives 



Un 
N-n-1 q 



no] 



E EE[i,^,<^_,ia^_„,<,.,^o$(x,,,i?^_„,,)e"^(-'^--)ni,|no] 

fc=0 i=l 

+ EE[i^^_<f;_^^7(x.)nuno]. 



The term corresponding to = in the above sum equals Hg{Ilo, RN-n,o)- Taking 
the conditional expectation w.r.t. in the other terms and noticing that one has 

{Ti < Un-u} = {Si < RN-n,o} yield 



E[g{X-^J\Uo] = Hg{Uo,RN-nfi) + W'l{s, 
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with 



rA^-n-l q 



E 



,1, 



k=l i=l 
1 



+E1 



i=l 



{TjV-n<'^JV-n} 



We now intend to a pply t he Markov property of the sequence (n„)„gN in the term S". 
Similarly to Lemma 4.28 for n > 1, on the set {Ti < f/jv_„}, one has RN_n-i,k-i°d = 



RN-n,k for all 1 < A; < n — 1. Thus, on the set {Ti < f/Tv-n}, one has Un-u 
Ti + Un-ti-i o 0- Recall that Ir^ , 



Irrr ^Pt ^ o 9. We may therefore 
apply the Markov property. Using Lemma 4.14, we now obtain S" = Vn+i{Ili). 



Finally, we have 

VnO^o) = Hg(Ilo,RN-nfl) + GVn+liJlo, RN-n,o) = JiVn+l, g)(Jlo, RN-n,o) ■ 

Recall that RN-n,o = ^Af-n-i(no, 5*0) and apply the translation operator 6*" to obtain 
the following recursion 

We are now able to study the error between Vn and Vn- Let us recall that, from 
its definition, f^^n-iij^n, Sn) equals either s^_„_]^(n„, Sn) or t*. In the latter case, 
notice that J{vn+i, g)(Jin,t*g) = Kvn+i(Jin)- Eventually, one has 

\Vn - Vn\ < l{r^_„_i(n„,s„)=t*}^ + ^{?N-n-iin„,Sn)=7%_^_^(u„,s„)}^ < Ay B, 
with 

iA = \KVn+l(nn) - Kn+lVn+l(nn)\, 

\ B = |J(t;„+i,5()(nn,s^_„_i(n„)) - max„eGr(A) ^n+l(?'n+l,fi')(nn,M)|- 
To bound the first term A, we introduce the function Vn+i- One has 

A < \KVn+l{Iin) - KVn+l{Iin)\ + \KVn+l{Iin) " KVn+l{fin)\ 

+ \KVn+l{f^n) - Kn+lVn+l{fln)\ + \Kn+lVn+l{f\-n) " i^n+l^^n+l (fin) | 
< {g) + {h) + il) + (j). 

In the above sum, the term (g) is bounded by E 



Vn+l — Vr, 



n+l\ 



(h), we use Proposition B.5 stating the Lipschitz continuity of the operator K. 



For the term 



The term (i) is bounded by Lemma 5.12 and the term (j) is bounded in the proof 



of Lemma 5.14 We now turn to the second term B. In the following computa- 



^^_^{Iln,Sn). Its definition yields B = \J{vn+i, g){'n.n,s* 



tions, denote s* 

Jn+i{vn+i, g)(Jln, Ouce again, we introduce the function Vn+i and since one has 



\J{Vn+l,g)(Jln,S*) - J{Vn+l,g)(Jln,S*)\ < E \V n+1 - K+1 



we only have to bound \ J{vn+i, g)(n.n, s*) — Jn+i{vn+i, fi')(n„, We proceed as for 
operator K and introduce the quantities J{vn+i, g)(Jln, s*) and Jn+i{vn+i, fi')(n„, s*). 
We obtain three differences that are bounded using Propositions B.l and B.4| for the 



first one, and with similar arguments as in Lemmas 5.13 and 5.14 respectively for 
the second and the third ones. □ 
6 Numerical example

We apply our procedure to a simple PDMP similar to the one studied in [3]. Let $E = [0; 1[$. For $x\in E$ and $t\ge 0$, the flow is defined by $\Phi(x,t) = x + vt$ so that $t^*(x) = (1-x)/v$. We set the jump rate to $\lambda(x) = ax$ for some $a>0$ and the transition kernel $Q(x,\cdot)$ to the uniform distribution on a finite set $E_0\subset E$. Thus, the process evolves toward 1 and the closer it gets to 1, the more likely it will jump back to some point of $E_0$. A trajectory is represented in Figure 1.

Figure 1: A trajectory of the process drawn until the 9th jump time with $a = 3$, $v = 1$ and $E_0 = \{0; 1/4; 1/2\}$. The dotted lines represent the possible post-jump values.

The observation process is $Y_n = \varphi(Z_n) + W_n$ where $\varphi(x) = x$ and $W_n\sim\mathcal{N}(0,\sigma^2)$ for some $\sigma^2>0$. Finally, we choose the reward function $g(x) = x$. Our assumptions thus clearly hold. Simulations are run with $a = 3$, $v = 1$, $E_0 = \{0; 1/4; 1/2\}$, $\sigma^2 = 0.25$ and $N = 9$.
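The exact value of $V_0$ is unknown but one has, as in [5],

$$\overline{V}_0 = \mathbb{E}[g(X_{\overline{\tau}})] \leq V_0 = \sup_{\sigma} \mathbb{E}[g(X_\sigma)] \leq \mathbb{E}\Big[\sup_{0 \leq t \leq T_N} g(X_t)\Big]. \qquad (15)$$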



Both the first and the last term may be estimated by Monte Carlo simulations. One has thus, with 10^ trajectories, $\mathbb{E}[\sup_{0 \leq t \leq T_N} g(X_t)] = 0.9944$. The values of $\overline{V}_0$, which depend on the quantization grids, are also obtained with 10^ Monte Carlo simulations and are gathered in Table 1, as well as the approximation $\widehat{V}_0$ of $V_0$ and the theoretical bound $B_{th}$ of the error $|V_0 - \widehat{V}_0|$ provided by Theorem 5.7. This bound decreases as the number of points in the quantization grids increases, as expected. Moreover, Eq. (15) provides an empirical bound $B_{em} = \max\{|\overline{V}_0 - \widehat{V}_0|,\ |\mathbb{E}[\sup_{0 \leq t \leq T_N} g(X_t)] - \widehat{V}_0|\}$.
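The Monte Carlo estimate of the upper bound in Eq. (15) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the initial state `x0 = 0` is an assumption (the starting distribution is not given in this section), and the function names are hypothetical. Inter-jump times are drawn by inverting the survival function $e^{-\Lambda(x,t)}$ with $\Lambda(x,t) = a(xt + vt^2/2)$, truncated at $t^*(x)$. Since $g(x) = x$ increases along the flow, the supremum over each inter-jump interval is the pre-jump location.

```python
import math
import random

# Parameters of the example in Section 6.
A, V = 3.0, 1.0          # jump-rate coefficient a and flow speed v
E0 = (0.0, 0.25, 0.5)    # possible post-jump locations
N_JUMPS = 9              # horizon N (number of jumps)


def t_star(x):
    """Deterministic time needed to reach the boundary 1 from x."""
    return (1.0 - x) / V


def sample_inter_jump_time(x, rng):
    """Draw the time to the next jump from x by inverse transform."""
    e = -math.log(1.0 - rng.random())  # Exp(1) variate, argument in (0, 1]
    # Solve a * (x*t + v*t^2/2) = e for t >= 0, then truncate at t*(x).
    t = (-x + math.sqrt(x * x + 2.0 * V * e / A)) / V
    return min(t, t_star(x))


def sup_reward_one_path(rng, x0=0.0):
    """Supremum of g(X_t) = X_t over [0, T_N] along one trajectory.

    x0 = 0 is an assumed initial state.
    """
    x, best = x0, x0
    for _ in range(N_JUMPS):
        s = sample_inter_jump_time(x, rng)
        best = max(best, x + V * s)  # left limit just before the jump
        x = rng.choice(E0)           # uniform post-jump location
        best = max(best, x)
    return best


def estimate_upper_bound(n_paths, seed=0):
    """Monte Carlo estimate of E[sup_{0 <= t <= T_N} g(X_t)]."""
    rng = random.Random(seed)
    return sum(sup_reward_one_path(rng) for _ in range(n_paths)) / n_paths
```

Calling `estimate_upper_bound(100_000)` gives an estimate to compare with the value 0.9944 reported above; the two need not coincide exactly, since the initial condition used here is an assumption.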



A Computation of a conditional expectation 



The objective of this section is to prove the technical Lemma A.2 used in the proof of Lemma 4.14. First, recall a classical result, see e.g. [12].
Quantization grids   $\Delta$      $\overline{V}_0$   $\widehat{V}_0$   $B_{em}$      $B_{th}$
50 points            [illegible]   0.7900             [illegible]       [illegible]   [illegible]
100 points           [illegible]   0.8031             [illegible]       [illegible]   [illegible]
300 points           [illegible]   0.8182             [illegible]       [illegible]   [illegible]
500 points           [illegible]   0.8250             [illegible]       [illegible]   [illegible]
1000 points          0.0535        0.8313             0.8545            0.140         152
2000 points          0.0453        0.8361             0.8599            0.135         110
4000 points          0.0381        0.8408             0.8643            0.130         80
6000 points          0.0345        0.8430             0.8666            0.128         67
8000 points          0.0321        0.8479             0.8725            0.122         58
10000 points         0.0303        0.8497             0.8742            0.120         53
12000 points         0.0290        0.8521             0.8771            0.117         49

Table 1: Simulation results. The terms $B_{em}$ and $B_{th}$ respectively denote an empirical bound and the theoretical bound provided by Theorem 5.7 for the error $|V_0 - \widehat{V}_0|$.
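As a consistency check on Table 1, the empirical bound can be recomputed from the other columns. The sketch below assumes that the two value columns are, in order, $\overline{V}_0$ (Monte Carlo value of the computed stopping rule) and $\widehat{V}_0$ (quantized approximation), and that $B_{em} = \max\{|\overline{V}_0 - \widehat{V}_0|,\ |0.9944 - \widehat{V}_0|\}$; the function name is hypothetical. Only the fully legible rows are used.

```python
# Monte Carlo estimate of E[sup g(X_t)] reported in Section 6.
SUP_BOUND = 0.9944

# (grid size, V_bar_0, V_hat_0, B_em as printed) for the legible rows.
ROWS = [
    (1000, 0.8313, 0.8545, 0.140),
    (2000, 0.8361, 0.8599, 0.135),
    (4000, 0.8408, 0.8643, 0.130),
    (6000, 0.8430, 0.8666, 0.128),
    (8000, 0.8479, 0.8725, 0.122),
    (10000, 0.8497, 0.8742, 0.120),
    (12000, 0.8521, 0.8771, 0.117),
]


def empirical_bound(v_bar, v_hat, sup_bound=SUP_BOUND):
    """B_em = max(|V_bar_0 - V_hat_0|, |E[sup g] - V_hat_0|)."""
    return max(abs(v_bar - v_hat), abs(sup_bound - v_hat))


def max_discrepancy(rows=ROWS):
    """Largest gap between the recomputed and the printed B_em."""
    return max(abs(empirical_bound(vb, vh) - b) for _, vb, vh, b in rows)
```

Under this reading of the columns, the recomputed values agree with the printed $B_{em}$ up to the three-decimal rounding of the table, and the maximum is always attained by the term involving the Monte Carlo upper bound.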



Theorem A.1. Let $X$ and $Y$ be real-valued integrable random variables defined on a probability space $(\Omega, \mathcal{A}, P)$ with values respectively in two measurable spaces $F_1$ and $F_2$. Let $\mathcal{B}$ be a sub-$\sigma$-field of $\mathcal{A}$ such that $X$ is $\mathcal{B}$-measurable and $Y$ is independent from $\mathcal{B}$. Let $f \in B(F_1 \times F_2)$, one has then

$$\mathbb{E}[f(X, Y) \mid \mathcal{B}] = \bar{f}(X),$$

where $\bar{f}(x) = \mathbb{E}[f(x, Y)]$.

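Lemma A.2. For all $k \in \mathbb{N}$, one has $\mathbb{E}[\mathbb{1}_{\{S_{k+1} > R_k\}} \mid \mathcal{F}_{T_k}] = \mathbb{1}_{\{R_k < t^*(Z_k)\}}\, e^{-\Lambda(Z_k, R_k)}$.

Proof First recall some results concerning the random variables $(S_k)_{k \in \mathbb{N}}$; details may be found in [?]. After a jump of the process to the point $z \in E$, the survival function of the time until the next jump is

$$\phi(t, z) = \begin{cases} 1 & \text{if } t < 0, \\ e^{-\Lambda(z, t)} & \text{if } 0 \leq t < t^*(z), \\ 0 & \text{if } t \geq t^*(z). \end{cases}$$

Define its generalized inverse $\psi(u, z) = \inf\{t \geq 0 \text{ such that } \phi(t, z) \leq u\}$. Then, for all $k \in \mathbb{N}$, one has $S_{k+1} = \psi(\Upsilon_k, Z_k)$, where $(\Upsilon_k)_{k \in \mathbb{N}}$ are i.i.d. random variables with uniform distribution on $[0; 1]$ and independent from $\mathcal{F}_{T_k}$. Thus, one has $\mathbb{E}[\mathbb{1}_{\{S_{k+1} > R_k\}} \mid \mathcal{F}_{T_k}] = \mathbb{E}[f(\Upsilon_k, Z_k, R_k) \mid \mathcal{F}_{T_k}]$ where $f(u, z, r) = \mathbb{1}_{\{\psi(u, z) > r\}}$. As $(Z_k, R_k)$ is $\mathcal{F}_{T_k}$-measurable, $\Upsilon_k$ is independent from $\mathcal{F}_{T_k}$ and $\mathbb{E}[\mathbb{1}_{\{\psi(\Upsilon_k, z) > r\}}] = \mathbb{1}_{\{r < t^*(z)\}}\, e^{-\Lambda(z, r)}$, Theorem A.1 yields the result. □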
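The identity $\mathbb{E}[\mathbb{1}_{\{\psi(\Upsilon, z) > r\}}] = \mathbb{1}_{\{r < t^*(z)\}} e^{-\Lambda(z, r)}$ used at the end of the proof can be checked numerically. The sketch below is an illustration only: it assumes the model of Section 6 ($\Phi(x,t) = x + vt$, $\lambda(x) = ax$ with $a = 3$, $v = 1$), for which $\Lambda(z, t) = a(zt + vt^2/2)$ has an explicit inverse, and all function names are hypothetical.

```python
import math
import random

A, V = 3.0, 1.0  # assumed parameters a and v of the Section 6 example


def t_star(z):
    return (1.0 - z) / V


def big_lambda(z, t):
    """Integrated jump rate Lambda(z, t) along the flow started at z."""
    return A * (z * t + V * t * t / 2.0)


def psi(u, z):
    """Generalized inverse inf{t >= 0 : phi(t, z) <= u} of the survival function."""
    if u >= 1.0:
        return 0.0
    if u <= 0.0:
        return t_star(z)
    e = -math.log(u)
    t = (-z + math.sqrt(z * z + 2.0 * V * e / A)) / V  # solves Lambda(z, t) = e
    return min(t, t_star(z))


def lemma_rhs(z, r):
    """Right-hand side 1_{r < t*(z)} * exp(-Lambda(z, r))."""
    return math.exp(-big_lambda(z, r)) if r < t_star(z) else 0.0


def empirical_survival(z, r, n, seed=0):
    """Monte Carlo frequency of the event {psi(U, z) > r}, U uniform on [0, 1)."""
    rng = random.Random(seed)
    return sum(psi(rng.random(), z) > r for _ in range(n)) / n
```

For instance, `empirical_survival(0.25, 0.3, 100_000)` should be close to `lemma_rhs(0.25, 0.3)`, and both vanish as soon as `r >= t_star(z)`.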



B Lipschitz properties 

In this section, we derive the Lipschitz properties of operators $H^m$, $I$, $G^m$, $K$ and $L$ in order to obtain them for the value functions $(v_n)_{0 \leq n \leq N}$.

Proposition B.1. For $m \in M$, $((\pi, u), (\tilde\pi, \tilde u)) \in (\mathcal{M}_1(E_0) \times \mathbb{R}_+)^2$, one has

$$|H^m g(\pi, u) - H^m g(\tilde\pi, \tilde u)| \leq C_g |\pi - \tilde\pi| + ([g]_2 + C_g C_\lambda)|u - \tilde u|.$$

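Proof Since the function $u \mapsto H^m g(\pi, u)$ is constant on the intervals $[0; t_m^*]$ and $[t_{m+1}^*; +\infty[$, we may assume that $u, \tilde u \in [t_m^*; t_{m+1}^*]$. Then one has $H^m g(\pi, u) = \sum_{i=m+1}^{q} \pi^i e^{-\Lambda(x_i, u)}\, g \circ \Phi(x_i, u)$, and similarly for $H^m g(\tilde\pi, \tilde u)$. Then, on the one hand, one has

$$|H^m g(\pi, u) - H^m g(\tilde\pi, u)| = \Big|\sum_{i=m+1}^{q} (\pi^i - \tilde\pi^i)\, e^{-\Lambda(x_i, u)}\, g \circ \Phi(x_i, u)\Big| \leq C_g \sum_{i=m+1}^{q} |\pi^i - \tilde\pi^i|.$$

On the other hand, Lemma A.1 in [5] yields

$$|e^{-\Lambda(x_i, u)}\, g \circ \Phi(x_i, u) - e^{-\Lambda(x_i, \tilde u)}\, g \circ \Phi(x_i, \tilde u)| \leq ([g]_2 + C_g C_\lambda)|u - \tilde u|.$$

Hence, the result. □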



The following technical lemma will be useful to derive the Lipschitz properties of the operator $I$. The first part of its proof is adapted from [2].

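Lemma B.2. For all $\pi, \tilde\pi \in \mathcal{M}_1(E_0)$ and $m \in M$, one has

$$\sum_{m=0}^{q-1} \int_{t_m^*}^{t_{m+1}^*} \int_{\mathbb{R}^d} |\Psi(\pi, y', s') - \Psi(\tilde\pi, y', s')|\, \overline{\Psi}_m(\pi, y', s')\, dy'\, ds' \leq 2|\pi - \tilde\pi|.$$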



Proof Let s' G]tJ^;t^+i[ and y' G M.'^. In the following computation, we denote 
T = (tt, y', s') and f = (vr, y', s'), one has 



q 

= E 



< E h^iM - + E #7^ 

i=l ' ' j=l 



Notice that Z]j=i ^m('^) = ^m(T) so that the second sum above reduces to |\E'm(r) — 
^m(r)| = EU - Finally, one has 



|vl/(r) - vl/(f)|vl/„(r) < 2Y\^Ur) - 



r . 
As $\int_{\mathbb{R}^d} f_W(y' - \varphi(x_j))\, dy' = 1$ and $\sum_{j=1}^{q} Q(\Phi(x_i, s'), x_j) = 1$, one obtains



E L *(7r,2/',s')-*(^,2/',«') '^m{^,y\s')dy'ds' 



m=0 "» i=l 
9-1 9 rt*,<l 



< 2E E /;"ELk-^iAW^.,«'))e 
xQ($(a;i, s'),Xj)fwiy' - ip{xj))dy' ds' 

m=Oi=m+l 

We obtain the result as $\int_0^{t_i^*} \lambda(\Phi(x_i, s'))\, e^{-\Lambda(x_i, s')}\, ds' = 1 - e^{-\Lambda(x_i, t_i^*)} \leq 1$. □



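Proposition B.3. For $v \in BL(\mathcal{M}_1(E_0))$ and $((\pi, u), (\tilde\pi, \tilde u)) \in (\mathcal{M}_1(E_0) \times \mathbb{R}_+)^2$, one has

$$|Iv(\pi, u) - Iv(\tilde\pi, \tilde u)| \leq (C_v + 2[v])|\pi - \tilde\pi| + C_v C_\lambda |u - \tilde u|.$$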



Proof On the one hand, one clearly has

$$|Iv(\pi, u) - Iv(\pi, \tilde u)| \leq \sum_{i=1}^{q} \pi^i\, |u \wedge t_i^* - \tilde u \wedge t_i^*|\, C_v C_\lambda \leq C_v C_\lambda |u - \tilde u|.$$

On the other hand, one has 
|/v(7r, u) — Iv{Tt, u)\ 

ft* 



< aiTT - 7f I + E tt' ' |i;(^(7r, y', s')) v[^{Tr, y', s')) 

Besides, we have assumed that $v$ is Lipschitz continuous so that one has

$$|v(\Psi(\pi, y', s')) - v(\Psi(\tilde\pi, y', s'))| \leq [v]\, |\Psi(\pi, y', s') - \Psi(\tilde\pi, y', s')|.$$

Thus, one has

\Iv{TT,y,s,u) - Iv{Tr,y,s,u)\ 



< C^ln - 7r\ + [v]j2 



i=l 



q-l q 



m=Oi=m+l ™ 



X 



m=0 "I 



^m(vr,?/', s)dy'ds. 



The previous lemma provides the result. 



□ 



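Proposition B.4. For $m \in M$, $v \in BL(\mathcal{M}_1(E_0))$ and $((\pi, u), (\tilde\pi, \tilde u)) \in (\mathcal{M}_1(E_0) \times \mathbb{R}_+)^2$, one has

$$|G^m v(\pi, u) - G^m v(\tilde\pi, \tilde u)| \leq (2C_v + 2[v])|\pi - \tilde\pi| + C_v C_\lambda |u - \tilde u|.$$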



Proof As in the proof of Proposition B.1, we may assume without loss of generality that $u, \tilde u \in [t_m^*; t_{m+1}^*]$ so that one has



i -A{xi,t*) 



i=l 



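and similarly for $G^m v(\tilde\pi, \tilde u)$. The second term does not depend on $u$ thus

$$|G^m v(\pi, u) - G^m v(\pi, \tilde u)| = |Iv(\pi, u) - Iv(\pi, \tilde u)|$$

and

$$|G^m v(\pi, u) - G^m v(\tilde\pi, u)| \leq |Iv(\pi, u) - Iv(\tilde\pi, u)| + C_v |\pi - \tilde\pi|,$$

as $\Psi(\pi, y', t^*) = \Psi(\tilde\pi, y', t^*)$ by Proposition 3.4. This yields the result. □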



Proposition B.5. For all $v \in BL(\mathcal{M}_1(E_0))$ and $(\pi, \tilde\pi) \in \mathcal{M}_1(E_0)^2$, one has

$$|Kv(\pi) - Kv(\tilde\pi)| \leq (2C_v + 2[v])|\pi - \tilde\pi|.$$

Proof The proof is similar to the previous one since $Kv(\pi) = Gv(\pi, t^*)$. □

Proposition B.6. For $v \in BL(\mathcal{M}_1(E_0))$ and $(\pi, \tilde\pi) \in \mathcal{M}_1(E_0)^2$, one has

$$|L(v, g)(\pi) - L(v, g)(\tilde\pi)| \leq (C_g + 2C_v + 2[v])|\pi - \tilde\pi|.$$

Proof One has

$$|L(v, g)(\pi) - L(v, g)(\tilde\pi)| \leq \max_{m} \sup_{u} |J^m(v, g)(\pi, u) - J^m(v, g)(\tilde\pi, u)| \vee |Kv(\pi) - Kv(\tilde\pi)| \leq (C_g + 2C_v + 2[v])|\pi - \tilde\pi|,$$

using Propositions B.1, B.4 and B.5 since $J^m(v, g) = H^m g + G^m v$. □



Proposition B.7. For all $n \in \{0, \ldots, N\}$, one has $v_n \in BL(\mathcal{M}_1(E_0))$ with $C_{v_n} \leq C_g$ and $[v_n] \leq (2^{N-n+2} - 3)C_g$.

Proof We proved that $v_n$ is the value function of the optimal stopping problem with horizon $T_{N-n}$, thus one has $v_n(\pi) = \sup_{\sigma \in \Sigma^Y_{N-n}} \mathbb{E}[g(X_\sigma) \mid \Pi_0 = \pi] \leq C_g$. Therefore $v_n$ is bounded and $C_{v_n} \leq C_g$. The second assertion is proved by backward induction. Let $\pi, \tilde\pi \in \mathcal{M}_1(E_0)$. One has

$$|v_N(\pi) - v_N(\tilde\pi)| \leq \sum_{j=1}^{q} g(x_j)\, |\pi^j - \tilde\pi^j| \leq C_g |\pi - \tilde\pi|.$$

Therefore, we have the result for $n = N$ with $[v_N] \leq C_g$. Moreover, since $v_n = L(v_{n+1}, g)$ for $0 \leq n \leq N - 1$, Proposition B.6 yields $[v_n] \leq 3C_g + 2[v_{n+1}]$, which proves the propagation of the induction. □
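The induction step can be verified mechanically: starting from $[v_N] \leq C_g$ and iterating $[v_n] \leq 3C_g + 2[v_{n+1}]$ reproduces the closed form $(2^{N-n+2} - 3)C_g$. The short sketch below, with hypothetical function names and $C_g$ normalized to 1 by default, unrolls the recursion and compares it with the closed form.

```python
def lipschitz_bounds(N, Cg=1.0):
    """Unroll b_N = Cg, b_n = 3*Cg + 2*b_{n+1} for n = N-1, ..., 0."""
    bounds = [0.0] * (N + 1)
    bounds[N] = Cg
    for n in range(N - 1, -1, -1):
        bounds[n] = 3.0 * Cg + 2.0 * bounds[n + 1]
    return bounds


def closed_form(N, n, Cg=1.0):
    """Closed-form bound (2^(N-n+2) - 3) * Cg from Proposition B.7."""
    return (2 ** (N - n + 2) - 3) * Cg
```

With the horizon $N = 9$ of the numerical example, the bound on $[v_0]$ is $(2^{11} - 3)C_g = 2045\,C_g$, which illustrates how quickly the Lipschitz constants, and hence the theoretical error bound, grow with the horizon.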